Abstract

The wiretap channel is a setting where one aims to provide information-theoretic privacy of 
communicated data based solely on the assumption that the channel from sender to adversary is 
"noisier" than the channel from sender to receiver. It has been the subject of decades of work in 
the information and coding (I&C) community. This paper bridges the gap between this body of 
work and modern cryptography with contributions along two fronts, namely metrics (definitions) 
of security, and schemes. We explain that the metric currently in use is weak and insufficient to 
guarantee security of applications and propose two replacements. One, that we call mis-security, is 
a mutual-information based metric in the I&C style. The other, semantic security, adapts to this 
setting a cryptographic metric that, in the cryptography community, has been vetted by decades 
of evaluation and endorsed as the target for standards and implementations. We show that they 
are equivalent (any scheme secure under one is secure under the other), thereby connecting two 
fundamentally different ways of defining security and providing a strong, unified and well-founded 
target for designs. Moving on to schemes, results from the wiretap community are mostly non- 
constructive, proving the existence of schemes without necessarily yielding ones that are explicit, 
let alone efficient, and only meeting their weak notion of security. We apply cryptographic methods 
based on extractors to produce explicit, polynomial-time and even practical encryption schemes 
that meet our new and stronger security target. 
1 Introduction



This paper aims to bridge the gap between two communities. The first, within information and coding 
(I&C), is the wiretap channel community, and the second is the modern cryptographic community.

The wiretap channel is a setting where one aims to communicate data with information-theoretic 
security under the sole assumption that the channel from sender to adversary is "noisier" than the 
channel from sender to receiver. Introduced by Wyner, Csiszár and Körner in the late seventies [45, 14],
it has developed in the I&C community over the last 30 years divorced from the parallel development
of Modern Cryptography. Yet the questions, centering as they are on data security, are at heart 
cryptographic. 

The first element of the gap is definitions. We explain that the security definition in current use, 
that we call mis-r (mutual-information security for random messages) is weak and insufficient to provide 
security of applications. We suggest strong, new definitions. One, that we call mis (mutual-information 
security), is an extension of mis-r and thus rooted in the I&C tradition and intuition. Another, 
semantic security, adapts a definition of [19, 2] that, vetted by decades of cryptographic research and
targeted by standards and schemes deployed in practice, is the cryptographic gold standard. We 
prove the two equivalent, thereby connecting two fundamentally different ways of defining privacy and 
providing a new, strong and well-founded target for constructions. 

The second element of the gap is techniques. The I&C community tends to view security as an add- 
on to error-correction, starting from error-correcting codes (ECCs) and tweaking the designs to confer 
security. Their results are mostly non-constructive, proving existence of schemes whose algorithms 
may not be efficient. Meanwhile, complexity-theorists and cryptographers have developed tools such 
as extractors which would seem eminently useful in this domain, even more so when one aims to achieve 
the new and more stringent definitions of privacy that we define. With this starting point, we take 
the opposite approach, making error-correction an add-on to security. We start from cryptographic 
tools to provide the security, error-correction being done later for no purpose other than correcting 
errors. We obtain constructive results which yield practical schemes to achieve all the security goals 
we define. Let us now look at all this in more detail. 

The wiretap model. The setting is depicted in Figure 1. The sender applies to her message $M$
a randomized encryption function $\mathcal{E}\colon \{0,1\}^m \to \{0,1\}^c$ to get what we call the sender ciphertext
$X \xleftarrow{\$} \mathcal{E}(M)$. This is transmitted to the receiver over the receiver channel ChR so that the latter
gets a receiver ciphertext $Y \xleftarrow{\$} \mathsf{ChR}(X)$ which it decrypts via algorithm $\mathcal{D}$ to recover the message. The
adversary's wiretap is modeled as another channel ChA and it accordingly gets an adversary ciphertext
$Z \xleftarrow{\$} \mathsf{ChA}(X)$ from which it tries to glean whatever it can about the message.

A channel is a randomized function specified by a transition probability matrix $W$ where $W[x,y]$
is the probability that input $x$ results in output $y$. Here $x, y$ are strings. Thus, for example, we regard
the Binary Symmetric Channel $\mathrm{BSC}_p$ with crossover probability $p < 1/2$ as taking a binary string $x$ of
any length and returning the string $y$ of the same length formed by flipping each bit of $x$ independently
with probability $p$. For concreteness and simplicity of exposition we will often phrase discussions in
the setting where ChR, ChA are BSCs with crossover probabilities $p_R, p_A < 1/2$ respectively, but our
results apply in much greater generality. In this case the assumption that ChA is "noisier" than ChR
corresponds to the assumption that $p_R < p_A$. This is the only assumption made: the adversary is
computationally unbounded, and the scheme is keyless, meaning sender and receiver are not assumed 
to a priori share any information not known to the adversary. 

Previous work. The setting now has a literature, in the I&C community, encompassing hundreds
of papers. (See the survey [27] or the book [5].) Schemes must satisfy two conditions, namely decoding
and security. The decoding condition asks that the scheme provide error-correction over the receiver
channel, namely $\lim_{k\to\infty} \Pr[\mathcal{D}(\mathsf{ChR}(\mathcal{E}(M))) \neq M] = 0$. The original security condition of [45] was
that $\lim_{k\to\infty} I(M; \mathsf{ChA}(\mathcal{E}(M)))/m = 0$, where $k$ is an underlying security parameter of which $m, c$ are
functions, the message random variable $M$ is uniformly distributed over $\{0,1\}^m$ and
$I(M;Z) = H(M) - H(M \mid Z)$ is the mutual information. This was critiqued by [32, 36], who put forth the stronger security
condition now in use, namely that $\lim_{k\to\infty} I(M; \mathsf{ChA}(\mathcal{E}(M))) = 0$. (The random variable $M$ continues
to be uniformly distributed over $\{0,1\}^m$.) The literature seeks to maximize the rate $\mathrm{Rate}(\mathcal{E}) = m/c$
of the scheme.

1 The notation $y \xleftarrow{\$} A(x)$ means that we run randomized function $A$ on input $x$ and denote the output by $y$.

Figure 1: Wiretap channel model. See text for explanations.

Shannon's seminal result [41] says that if we ignore security and merely consider achieving the
decoding condition then the maximum achievable rate $\mathrm{Rate}(\mathcal{E})$ is the receiver channel capacity, which
in the BSC case is $1 - h_2(p_R)$ where $h_2$ is the binary entropy function $h_2(p) = -p\lg(p) - (1-p)\lg(1-p)$.
He gave non-constructive proofs of existence of schemes meeting capacity.

Coming in with this background and the added security condition, it was natural for the wiretap 
community to follow Shannon's lead and begin by asking what is the maximum achievable rate, now 
subject to both the security and decoding conditions. This optimal rate is called the secrecy capacity 
and, in the case of BSCs, equals the difference $(1 - h_2(p_R)) - (1 - h_2(p_A)) = h_2(p_A) - h_2(p_R)$ in
capacities of the receiver and adversary channels. Non-constructive proofs of the existence of schemes
with this optimal rate were given in [45, 14, 6]. A lot of work has followed aiming to establish similar
results for other channels. Little attention has been given to finding explicit schemes with efficient 
encoding and decoding. 
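
As a small numerical illustration of the capacity formulas above, the following sketch evaluates the binary entropy function, the BSC capacity, and the BSC secrecy capacity. The function names are our own and purely illustrative; the only assumption is the wiretap condition $p_R < p_A < 1/2$ stated earlier.

```python
import math


def h2(p: float) -> float:
    """Binary entropy in bits, using the convention 0 * lg(0) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def bsc_capacity(p: float) -> float:
    """Capacity 1 - h2(p) of a binary symmetric channel with crossover p."""
    return 1.0 - h2(p)


def bsc_secrecy_capacity(p_r: float, p_a: float) -> float:
    """Secrecy capacity h2(p_a) - h2(p_r) when both channels are BSCs.

    Requires the wiretap assumption p_r < p_a < 1/2.
    """
    if not (0.0 <= p_r < p_a < 0.5):
        raise ValueError("expected 0 <= p_r < p_a < 1/2")
    return bsc_capacity(p_r) - bsc_capacity(p_a)


if __name__ == "__main__":
    # Example parameters chosen arbitrarily for illustration.
    print(bsc_secrecy_capacity(0.05, 0.2))
```

The secrecy capacity is positive exactly because $h_2$ is increasing on $[0, 1/2]$, so a noisier adversary channel leaves a strictly positive gap.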

Context. Practical interest in the wiretap setting is escalating. Its proponents note two striking 
benefits over conventional cryptography: (1) no computational assumptions, and (2) no keys and 
hence no key distribution. Item (1) is attractive to governments who are concerned with long-term 
security and worried about quantum computing. Item (2) is attractive in a world where vulnerable, 
low-power devices are proliferating and key-distribution and key-management are insurmountable
obstacles to security. The practical challenge is to realize a secrecy capacity, meaning ensure by 
physical means that the adversary channel is noisier than the receiver one. The degradation with 
distance of radio communication signal quality is the basis of several approaches being investigated 
for wireless settings. Government-sponsored Ziva Corporation [46] is using optical techniques to build 
a receiver channel in such a way that wiretapping results in a degraded channel. A program called 
Physical Layer Security aimed at practical realization of the wiretap channel is the subject of books [5] 
and conferences [22]. All this activity means that schemes are being sought for implementation. If
so, we need privacy definitions that yield security in applications, and we need constructive results 
yielding practical schemes achieving privacy under these definitions. This is what we aim to supply. 

Definitions. A security metric xs associates to encryption function $\mathcal{E}\colon \{0,1\}^m \to \{0,1\}^c$ and
adversary channel ChA a number $\mathbf{Adv}^{\mathrm{xs}}(\mathcal{E};\mathsf{ChA})$ that measures the maximum "advantage" of an
adversary in breaking the scheme under metric xs. For example, the metric underlying the current,
above-mentioned security condition is $\mathbf{Adv}^{\mathrm{mis\text{-}r}}(\mathcal{E};\mathsf{ChA}) = I(M;\mathsf{ChA}(\mathcal{E}(M)))$ where $M$ is uniformly
distributed over $\{0,1\}^m$. We call this the mis-r (mutual-information security for random messages)
metric because messages are assumed to be random. From the cryptographic perspective, this is
extraordinarily weak, for we know that real messages are not random. (They may be files, votes or any
type of structured data, often with low entropy. Contrary to a view in the I&C community, compression
does not render data random, as can be seen from the case of votes, where the message space
has very low entropy.) This leads us to suggest a stronger metric that we call mutual-information
security, defined via $\mathbf{Adv}^{\mathrm{mis}}(\mathcal{E};\mathsf{ChA}) = \max_{M} I(M;\mathsf{ChA}(\mathcal{E}(M)))$ where the maximum is over all random
variables $M$ over $\{0,1\}^m$, regardless of their distribution.

    Arrow   Bound                                                              Source
    1       $\epsilon_{\mathrm{ds}} \le \sqrt{2\,\epsilon_{\mathrm{mis}}}$     Theorem 4.5
    2       $\epsilon_{\mathrm{mis}} \le 2\,\epsilon_{\mathrm{ds}} \lg(\ldots)$   Theorem ...
    3       $\epsilon_{\mathrm{ss}} \le \epsilon_{\mathrm{ds}}$                Theorem 4.1
    4       $\epsilon_{\mathrm{ds}} \le 2\,\epsilon_{\mathrm{ss}}$             Theorem 4.1

Figure 2: Relations between notions. An arrow $A \to B$ is an implication, meaning every scheme
that is A-secure is also B-secure, while a barred arrow $A \not\to B$ is a separation, meaning that there is
an A-secure scheme that is not B-secure. If $\mathcal{E}\colon \{0,1\}^m \to \{0,1\}^c$ is the encryption function and ChA
the adversary channel, we let $\epsilon_{\mathrm{xs}} = \mathbf{Adv}^{\mathrm{xs}}(\mathcal{E};\mathsf{ChA})$. The table then shows the quantitative bounds
underlying the annotated implications.

At this point, we have a legitimate metric, but how does it capture privacy? The intuition is 
that it is measuring the difference in the number of bits required to encode the message before and 
after seeing the ciphertext. This intuition is alien to cryptographers, whose metrics are based on 
much more direct and usage-driven privacy requirements. Cryptographers have understood since [19] that
encryption must hide all partial information about the message, meaning the adversary should have 
little advantage in computing a function of the message given the ciphertext. (Examples of partial 
information about a message include its first bit or even the XOR of the first and second bits.) The 
mis-r and mis metrics ask for nothing like this and are based on entirely different intuition. We extend 
Goldwasser and Micali's semantic security [19] definition to the wiretap setting, defining 

$$\mathbf{Adv}^{\mathrm{ss}}(\mathcal{E};\mathsf{ChA}) = \max_{f,M}\Big(\max_{\mathcal{A}}\Pr[\mathcal{A}(\mathsf{ChA}(\mathcal{E}(M))) = f(M)] - \max_{\mathcal{S}}\Pr[\mathcal{S}(m) = f(M)]\Big) .$$

Within the parentheses is the maximum probability that an adversary $\mathcal{A}$, given the adversary ciphertext,
can compute the result of function $f$ on the message, minus the maximum probability that a
simulator $\mathcal{S}$ can do the same given only the length of the message. We also define a distinguishing
security (ds) metric as an analog of indistinguishability [19] via

$$\mathbf{Adv}^{\mathrm{ds}}(\mathcal{E};\mathsf{ChA}) = \max_{\mathcal{A},M_0,M_1}\Big(2\Pr[\mathcal{A}(M_0,M_1,\mathsf{ChA}(\mathcal{E}(M_b))) = b] - 1\Big)$$

where challenge bit $b$ is uniformly distributed over $\{0,1\}$ and the maximum is over all $m$-bit messages
$M_0, M_1$ and all adversaries $\mathcal{A}$. For any metric xs, we say $\mathcal{E}$ provides XS-security over ChA if
$\lim_{k\to\infty} \mathbf{Adv}^{\mathrm{xs}}(\mathcal{E};\mathsf{ChA}) = 0$.
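
To make the ds metric concrete, the sketch below computes it exhaustively for a tiny example. It relies on a standard fact that is not stated in this section: for fixed $M_0, M_1$, the best distinguisher's value of $2\Pr[\cdot] - 1$ equals the statistical distance between the two adversary-ciphertext distributions. The toy encryption function $\mathcal{E}(M) = (R, M \oplus R)$ with a random bit $R$ is our own hypothetical example, not a scheme from the paper, and all function names are illustrative.

```python
from itertools import product


def bsc(p, c):
    """Transition table of BSC_p on c-bit tuples: table[x][y] = Pr[y | x]."""
    table = {}
    for x in product((0, 1), repeat=c):
        row = {}
        for y in product((0, 1), repeat=c):
            flips = sum(a != b for a, b in zip(x, y))
            row[y] = p ** flips * (1 - p) ** (c - flips)
        table[x] = row
    return table


def compose(first, second):
    """Output distribution of second(first(x)) for every input x of first."""
    out = {}
    for x, dist_x in first.items():
        dist = {}
        for mid, p_mid in dist_x.items():
            for z, p_z in second[mid].items():
                dist[z] = dist.get(z, 0.0) + p_mid * p_z
        out[x] = dist
    return out


def statistical_distance(p, q):
    """Half the L1 distance between two distributions given as dicts."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def ds_advantage(enc, cha):
    """Max over message pairs of SD(ChA(E(M0)); ChA(E(M1)))."""
    view = compose(enc, cha)
    msgs = list(view)
    return max(statistical_distance(view[a], view[b])
               for a in msgs for b in msgs)


# Hypothetical toy encryption of a 1-bit message: E(M) = (R, M xor R).
toy_enc = {(m,): {(r, m ^ r): 0.5 for r in (0, 1)} for m in (0, 1)}
adv = ds_advantage(toy_enc, bsc(0.25, 2))
```

Exhaustive enumeration like this is only feasible for very short messages and ciphertexts; it is meant to illustrate the definition, not to evaluate real schemes.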

Relations. The mutual information between message and ciphertext, as measured by mis, is, as 
noted above, the change in the number of bits needed to encode the message created by seeing the 
ciphertext. It is not clear why this should measure privacy in the sense of semantic security. Yet 
we are able to show that mutual-information security and semantic security are equivalent, meaning
an encryption scheme is MIS-secure if and only if it is SS-secure. Figure 2 summarizes this and
other relations we establish that between them settle all possible relations. The equivalence between
SS and DS is the information-theoretic analogue of the corresponding well-known equivalence in the
computational setting [19, 2]. As there, however, it brings the important benefit that we can now
work with the technically simpler DS. We then show that MIS implies DS by reducing to Pinsker's
inequality. We show conversely that DS implies MIS via a general relation between mutual information
and statistical distance. As Figure 2 indicates, the asymptotic relations are all underlain by concrete
quantitative and polynomial relations between the advantages.

We show that in general MIS-R does not imply MIS, meaning the former is strictly weaker than 
the latter. We do this by exhibiting an encryption function $\mathcal{E}$ and channel ChA such that $\mathcal{E}$ is
MIS-R-secure relative to ChA but MIS-insecure relative to ChA. Furthermore we do this for the case that
ChA is a BSC. However, in Section 4.6 we show that for certain encryption schemes, namely ones that
are separable and message-linear, and for certain channels, MIS-R security does imply MIS-security.
This somewhat surprising result has been exploited in [29, 28] to build MIS-secure schemes.

Schemes. We improve over existing schemes in several ways. First, while previous work targeted 
MIS-R security, we target and achieve the stronger MIS, DS and SS goals. Second, while previous 
results were largely proving existence of schemes without giving explicit ones, our schemes are not 
only explicit but have efficient and simple encryption and decryption and may be used in the practical 
settings that are now emerging in this area. Third, the methods of previous work were ECC-intrusive, 
taking ECCs and modifying them to add security. Our approach is modular, combining extractors 
with existing ECCs. 

A common misconception is to think that privacy and error-correction may be completely de- 
coupled, meaning one would first build a scheme that is secure when the receiver channel is noiseless 
and then add an ECC on top to meet the decoding condition with a noisy receiver channel. This does 
not work because the error-correction helps the adversary by reducing the noise over the adversary 
channel. The two requirements do need to be considered together. We are still, however, able to provide 
a modular approach. First designing secure schemes for the case of a noiseless receiver channel, we 
then put ECCs on top for error-correction, but we are able to state certain simple conditions on the 
ECCs that suffice for preservation of security, and provide non-intrusive analyses showing that most 
ECCs have these properties. 

Let $\mathcal{H}\colon \{0,1\}^h \times \{0,1\}^n \to \{0,1\}^m$ be a universal hash function. In the case of a noiseless receiver
channel, our scheme, given an $m$-bit message $M$, picks at random an $h$-bit string $H$ and an $n$-bit string
$U$, and sets the sender ciphertext to $U \| H \| (\mathcal{H}(H,U) \oplus M)$. Using the extraction properties of universal
hash functions established by the Leftover Hash Lemma [20] and its generalization [16, 15] we can
bound the DS-advantage of this scheme in terms of the probability that the adversary, given the
adversary ciphertext, recovers $U$. This probability can in turn be bounded for specific channels like
the BSC. In the case the receiver channel has error, we apply an ECC to $U$ to get $U'$, apply an
ECC to $H \| (\mathcal{H}(H,U) \oplus M)$ to get $V$ and let the sender ciphertext be $U' \| V$. In this case we bound
the DS-advantage of this scheme in terms of what we call the rs-r-advantage of the first ECC, which
is the probability that an adversary, given the result of the adversary channel on $U'$, can recover $U$.
This probability too can be bounded directly for BSCs. We also provide tools to bound it for other
channels. Our approach for the case of a noisy receiver channel is inspired by the concepts of secure
sketches and fuzzy extractors [16, 15], translating ideas there to the wiretap setting.
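
The noiseless-receiver construction can be sketched in a few lines. This is a minimal illustration only: the paper does not fix a particular universal hash family here, so the choice below (multiplication by a random $m \times n$ binary matrix over GF(2), giving seed length $h = mn$) is our assumption, as are the function names and parameter values. The sketch demonstrates only that decryption inverts encryption; it makes no security claim and includes no error correction.

```python
import secrets


def hash_gf2(seed_rows, u):
    """Assumed universal hash: binary matrix times vector over GF(2).

    seed_rows holds m rows, each an n-bit integer; u is an n-bit integer.
    Returns an m-bit integer whose bits are the row-by-u inner products.
    """
    out = 0
    for row in seed_rows:
        parity = bin(row & u).count("1") & 1
        out = (out << 1) | parity
    return out


def encrypt(msg, m, n):
    """Sender ciphertext (U, H, hash(H, U) xor M) for an m-bit message."""
    u = secrets.randbits(n)                         # random n-bit string U
    seed = [secrets.randbits(n) for _ in range(m)]  # random seed H (m*n bits)
    return u, seed, hash_gf2(seed, u) ^ msg


def decrypt(ciphertext):
    """Recompute the mask from U and H, then strip it off the last part."""
    u, seed, masked = ciphertext
    return hash_gf2(seed, u) ^ masked


if __name__ == "__main__":
    message = 0b10110010
    assert decrypt(encrypt(message, m=8, n=32)) == message
```

Note that both $U$ and $H$ travel in the clear: the scheme is keyless, and any secrecy comes from the adversary channel's noise on $U$, as the text explains.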

Related work. Mahdavifar and Vardy [28, 29] provide an explicit MIS-R-secure scheme with optimal
rate, meaning rate equal to the MIS-R secrecy capacity. But they give no proof that decryption
(decoding) is possible for their scheme, even in principle, let alone in polynomial time. The central
open question in the wiretap channel community was whether there is a polynomial time (this means 
both encryption and decryption are polynomial time) MIS-R secure scheme with optimal rate. Our 
schemes achieve MIS-R security in polynomial time but do not have optimal rate, so we do not answer 
this question. The question has finally been settled by [3], who provide a polynomial-time MIS-R 
secure scheme with optimal rate. 

Mahdavifar and Vardy [28, 29] apply our results from Section 4.6 to conclude that their MIS-R
secure scheme also provides MIS (and thus by our results DS and SS) security for certain channels. 
This shows that the optimal rate for DS-security is the same as for MIS-R security for these channels. 
The question this raises is whether one can achieve DS-security at this optimal rate in polynomial 
time. This question too is resolved by [3], whose above-mentioned scheme in fact directly achieves
DS-security.

Appendix A provides a comprehensive survey of the large body of work related to wiretap security,
and more broadly, to information-theoretic secure communication in a noisy setup [30].

2 Preliminaries 

Basic notation and definitions. If $s$ is a binary string then $s[i]$ denotes its $i$-th bit and $|s|$ denotes
its length. If $S$ is a set then $|S|$ denotes its size. If $x$ is a real number then $|x|$ denotes its absolute
value. If $s_1, \ldots, s_l$ are strings then $s_1 \| \cdots \| s_l$ denotes their concatenation. If $s$ is a string and $n$ a
non-negative integer then $s^n$ denotes the concatenation of $n$ copies of $s$.

A probability distribution is a function $P$ that associates to each $x$ a probability $P(x) \in [0,1]$. The
support $\mathrm{supp}(P)$ is the set of all $x$ such that $P(x) > 0$. All probability distributions in this paper
are discrete. Associate to random variable $X$ and event $E$ the probability distributions $P_X, P_{X|E}$
defined for all $x$ by $P_X(x) = \Pr[X = x]$ and $P_{X|E}(x) = \Pr[X = x \mid E]$. We denote by $\lg(\cdot)$ the
logarithm in base two, and by $\ln(\cdot)$ the natural logarithm. We adopt standard conventions such as
$0 \lg 0 = 0 \lg \infty = 0$ and $\Pr[E_1 \mid E_2] = 0$ when $\Pr[E_2] = 0$. The function $h\colon [0,1] \to [0,1]$ is defined by
$h(x) = -x \lg x$. The (Shannon) entropy of a probability distribution $P$ is defined by

$$H(P) = \sum_x h(P(x)) = -\sum_x P(x) \lg P(x) .$$

The (Shannon) entropy of a random variable $X$ is defined by

$$H(X) = H(P_X) = \sum_x h(P_X(x)) .$$

The statistical difference between probability distributions $P, Q$ is defined by

$$\mathrm{SD}(P;Q) = \frac{1}{2} \sum_x |P(x) - Q(x)| .$$

The statistical difference between random variables $X_1, X_2$ is defined by

$$\mathrm{SD}(X_1;X_2) = \mathrm{SD}(P_{X_1};P_{X_2}) = \frac{1}{2} \sum_x \big|\Pr[X_1 = x] - \Pr[X_2 = x]\big| .$$

If $X, Y$ are random variables the conditional entropy is defined via

$$H(X \mid Y) = \sum_y P_Y(y) \cdot H(X \mid Y = y) \quad\text{where}\quad H(X \mid Y = y) = \sum_x h(P_{X|Y=y}(x)) .$$

The mutual information between random variables $X, Y$ is defined by

$$I(X;Y) = H(X) - H(X \mid Y) .$$

The guessing-probability GP and min-entropy $H_\infty$ of a random variable $X$ are defined via

$$\mathrm{GP}(X) = \max_x \Pr[X = x] = 2^{-H_\infty(X)} .$$

The average guessing probability $\mathrm{GP}(X \mid Z)$ and average min-entropy $H_\infty(X \mid Z)$ are defined for random
variables $X, Z$ via

$$\mathrm{GP}(X \mid Z) = \sum_z \Pr[Z = z] \cdot \max_x \Pr[X = x \mid Z = z] = 2^{-H_\infty(X \mid Z)} .$$
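
The quantities just defined are straightforward to compute for small discrete distributions. The sketch below is one way to do so, assuming a joint distribution is given as a dictionary mapping pairs $(x, y)$ to probabilities; that representation and the function names are our own choices for illustration.

```python
import math
from collections import defaultdict


def entropy(dist):
    """Shannon entropy H(P) = -sum_x P(x) lg P(x), with 0 lg 0 = 0."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def marginals(joint):
    """Marginal distributions P_X and P_Y of a joint table {(x, y): prob}."""
    px, py = defaultdict(float), defaultdict(float)
    for (x, y), p in joint.items():
        px[x] += p
        py[y] += p
    return dict(px), dict(py)


def conditional_entropy(joint):
    """H(X|Y) = sum_y P_Y(y) * H(X | Y = y)."""
    _, py = marginals(joint)
    total = 0.0
    for y, p_y in py.items():
        if p_y == 0:
            continue
        cond = {x: p / p_y for (x, yy), p in joint.items() if yy == y}
        total += p_y * entropy(cond)
    return total


def mutual_information(joint):
    """I(X;Y) = H(X) - H(X|Y)."""
    px, _ = marginals(joint)
    return entropy(px) - conditional_entropy(joint)


def avg_guessing_probability(joint):
    """GP(X|Z) = sum_z max_x Pr[X = x, Z = z] for a table {(x, z): prob}."""
    best = defaultdict(float)
    for (x, z), p in joint.items():
        best[z] = max(best[z], p)
    return sum(best.values())


def avg_min_entropy(joint):
    """Average min-entropy, -lg GP(X|Z)."""
    return -math.log2(avg_guessing_probability(joint))


# Example: X is a uniform bit and Z is X sent through a BSC with p = 0.25.
joint = {(0, 0): 0.375, (0, 1): 0.125, (1, 0): 0.125, (1, 1): 0.375}
```

In `avg_guessing_probability` the identity $\Pr[Z = z] \cdot \max_x \Pr[X = x \mid Z = z] = \max_x \Pr[X = x, Z = z]$ is used so that no division by $\Pr[Z = z]$ is needed.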

Transforms. We say that $T$ is a transform with domain $D$ and range $R$, written $T\colon D \to R$, if $T(x)$
is a random variable over $R$ for every $x \in D$. Thus, $T$ is fully specified by a sequence $P = \{P_x\}_{x \in D}$ of
probability distributions over $R$, where $P_x(y) = \Pr[T(x) = y]$ for all $x \in D$ and $y \in R$. We call $P$ the
distribution associated to $T$. This distribution can be specified by a $|D|$ by $|R|$ transition probability
matrix $W$ defined by $W[x,y] = P_x(y)$. An adversary too is a transform, and so is a simulator.

Channels. A channel is, again, just a transform. In more conventional communications terminology,
a channel $\mathsf{Ch}\colon D \to R$ has input alphabet $D$ and output alphabet $R$.

If $\mathsf{B}\colon \{0,1\} \to Z$ is a channel and $c \ge 1$ is an integer we define the channel $\mathsf{B}^c\colon \{0,1\}^c \to Z^c$ by
$\mathsf{B}^c(X) = \mathsf{B}(X[1]) \| \cdots \| \mathsf{B}(X[c])$ for all $X = X[1] \ldots X[c] \in \{0,1\}^c$. The applications of $\mathsf{B}$ are all independent,
meaning that if $W$ is the transition probability matrix of $\mathsf{B}$ then the transition probability matrix $W^c$
of $\mathsf{B}^c$ is defined by $W^c[X,Y] = W[X[1],Y[1]] \cdots W[X[c],Y[c]]$ for all $X = X[1] \ldots X[c] \in \{0,1\}^c$ and
$Y = Y[1] \ldots Y[c] \in Z^c$. We say that a channel Ch is binary if it equals $\mathsf{B}^c$ for some channel $\mathsf{B}$ and
some $c$, in which case we refer to $\mathsf{B}$ as the base (binary) channel and Ch as the channel induced by $\mathsf{B}$.

By $\mathrm{BSC}_p\colon \{0,1\} \to \{0,1\}$ we denote the binary symmetric channel with crossover probability $p$
($0 < p < 1/2$). Its transition probability matrix $W$ has $W[x,y] = p$ if $x \neq y$ and $1 - p$ otherwise for
all $x, y \in \{0,1\}$. The induced channel $\mathrm{BSC}_p^c$ flips each input bit independently with probability $p$.
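
The induced-channel definition translates directly into code. The sketch below represents a base channel's transition matrix as a nested dictionary `W[x][y]`, which is our own representation chosen for illustration, and shows both the product formula for $W^c[X,Y]$ and sampling from the induced channel by independent per-symbol applications.

```python
import random


def bsc_base(p):
    """Transition matrix of the base channel BSC_p as W[x][y]."""
    return {0: {0: 1 - p, 1: p}, 1: {0: p, 1: 1 - p}}


def induced_prob(W, X, Y):
    """W^c[X, Y] = W[X[1], Y[1]] * ... * W[X[c], Y[c]]."""
    prob = 1.0
    for x, y in zip(X, Y):
        prob *= W[x][y]
    return prob


def apply_channel(W, X, rng):
    """Sample from the induced channel: one independent use of B per symbol."""
    out = []
    for x in X:
        outputs = list(W[x])
        weights = [W[x][y] for y in outputs]
        out.append(rng.choices(outputs, weights=weights)[0])
    return tuple(out)


if __name__ == "__main__":
    W = bsc_base(0.1)
    rng = random.Random(2012)  # fixed seed only to make the demo repeatable
    print(apply_channel(W, (0, 1, 1, 0, 1), rng))
    print(induced_prob(W, (0, 1, 1), (0, 0, 1)))
```

Because the per-symbol applications are independent, `induced_prob` is simply a product, which for a BSC depends only on the Hamming distance between input and output.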

The receiver and adversary channels of the wiretap setting will have domain $\{0,1\}^c$, where $c$ is the
length of the sender ciphertext, and range $\{0,1\}^d$, where the output length $d$ may differ between the
two channels. Such channels may be binary, which is the most natural example, but our equivalences
between security notions hold for all channels, even ones that are not binary.

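If $\mathsf{Ch1}\colon \{0,1\}^{c_1} \to \{0,1\}^{d_1}$ and $\mathsf{Ch2}\colon \{0,1\}^{c_2} \to \{0,1\}^{d_2}$ are channels then $\mathsf{Ch1} \| \mathsf{Ch2}$ denotes the
channel $\mathsf{Ch}\colon \{0,1\}^{c_1+c_2} \to \{0,1\}^{d_1+d_2}$ defined by $\mathsf{Ch}(x_1 \| x_2) = \mathsf{Ch1}(x_1) \| \mathsf{Ch2}(x_2)$ for all $x_1 \in \{0,1\}^{c_1}$
and $x_2 \in \{0,1\}^{c_2}$. We say that a channel $\mathsf{Ch}\colon \{0,1\}^c \to \{0,1\}^d$ is $(c_1, c_2)$-splittable if there are channels
$\mathsf{Ch1}\colon \{0,1\}^{c_1} \to \{0,1\}^{d_1}$ and $\mathsf{Ch2}\colon \{0,1\}^{c_2} \to \{0,1\}^{d_2}$ such that $\mathsf{Ch} = \mathsf{Ch1} \| \mathsf{Ch2}$.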

For any integer $s$ we let $\mathsf{Id}_s\colon \{0,1\}^s \to \{0,1\}^s$ denote the identity function defined by $\mathsf{Id}_s(x) = x$
for all $x \in \{0,1\}^s$. This represents a clear channel.

We say that a channel $\mathsf{Ch}\colon D \to R$ with transition matrix $W$ is symmetric if there exists a
partition of the range as $R = R_1 \cup \cdots \cup R_n$ such that for all $i$ the sub-matrix $W[\cdot, R_i]$ induced by the
columns in $R_i$ is strongly symmetric, i.e., all rows are permutations of each other, and all columns are
permutations of each other.

Algorithms. We sometimes describe a transform (for example an encryption function or an adversary)
algorithmically. In this view, a transform $T$ takes input $X$ and coins $R$ to deterministically return
an output $Y \leftarrow T(X;R)$. By $Y \xleftarrow{\$} T(X)$ we mean that we pick $R$ at random and let $Y \leftarrow T(X;R)$.
The probability that a particular $Y$ is produced by this process is $W[X,Y]$ where $W$ is the transition
probability matrix associated to transform $T$. As an example, we could specify $T$ by saying that on
input $X$ and coins $R$, these being strings of the same length $m$, it returns $X \oplus R$. Then $Y \xleftarrow{\$} T(X)$
means that we pick $R$ at random and let $Y = X \oplus R$, and the transition probability matrix $W$ is a $2^m$
by $2^m$ matrix all of whose entries equal $2^{-m}$. If $S$ is a (finite) set then $s \xleftarrow{\$} S$ denotes the operation
of picking a point at random from $S$ and denoting it $s$.
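
The XOR example can be checked mechanically. The sketch below derives a transform's transition probability matrix by enumerating all coin strings, assuming the coins are uniform bit strings of a known length; the helper names are ours and the code is only meant to illustrate the algorithmic view of a transform.

```python
from fractions import Fraction
from itertools import product


def transition_matrix(transform, m, coin_len):
    """Build W[x][y] = Pr over uniform coins R that transform(x, R) == y."""
    weight = Fraction(1, 2 ** coin_len)
    W = {}
    for x in product((0, 1), repeat=m):
        row = {}
        for r in product((0, 1), repeat=coin_len):
            y = transform(x, r)
            row[y] = row.get(y, Fraction(0)) + weight
        W[x] = row
    return W


def xor_transform(x, r):
    """The example from the text: on input X and coins R, return X xor R."""
    return tuple(a ^ b for a, b in zip(x, r))


# For m = 3 this is an 8 by 8 matrix whose entries all equal 2^{-3}.
W = transition_matrix(xor_transform, m=3, coin_len=3)
```

Exact rational arithmetic via `Fraction` avoids any floating-point doubt when confirming that every entry equals $2^{-m}$.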

3 Security metrics 

3.1 Encryption functions and schemes 

An encryption function is a transform $\mathcal{E}\colon \{0,1\}^m \to \{0,1\}^c$ where $m$ is the message length and $c$ is the
sender ciphertext length. The rate of $\mathcal{E}$ is $\mathrm{Rate}(\mathcal{E}) = m/c$. If $\mathsf{ChR}\colon \{0,1\}^c \to \{0,1\}^d$ is a receiver channel
then a decryption function for $\mathcal{E}$ over ChR is a transform $\mathcal{D}\colon \{0,1\}^d \to \{0,1\}^m$ whose decryption
error $\mathrm{DE}(\mathcal{E};\mathcal{D};\mathsf{ChR})$ is defined as the maximum, over all $M \in \{0,1\}^m$, of $\Pr[\mathcal{D}(\mathsf{ChR}(\mathcal{E}(M))) \neq M]$.

An encryption scheme $\mathcal{E} = \{\mathcal{E}_k\}_{k \in \mathbb{N}}$ is a family of encryption functions where
$\mathcal{E}_k\colon \{0,1\}^{m(k)} \to \{0,1\}^{c(k)}$ for functions $m, c\colon \mathbb{N} \to \mathbb{N}$ called the message length and sender ciphertext length of the
scheme. The rate of $\mathcal{E}$ is the function $\mathrm{Rate}_{\mathcal{E}}\colon \mathbb{N} \to \mathbb{R}$ defined by $\mathrm{Rate}_{\mathcal{E}}(k) = \mathrm{Rate}(\mathcal{E}_k)$ for all $k \in \mathbb{N}$.
Suppose $\mathsf{ChR} = \{\mathsf{ChR}_k\}_{k \in \mathbb{N}}$ is a family of receiver channels where $\mathsf{ChR}_k\colon \{0,1\}^{c(k)} \to \{0,1\}^{d(k)}$. Then
a decryption scheme for $\mathcal{E}$ over ChR is a family $\mathcal{D} = \{\mathcal{D}_k\}_{k \in \mathbb{N}}$ where $\mathcal{D}_k\colon \{0,1\}^{d(k)} \to \{0,1\}^{m(k)}$ is a
decryption function for $\mathcal{E}_k$ over $\mathsf{ChR}_k$. We say that $\mathcal{D}$ is a correct decryption scheme for $\mathcal{E}$ relative
to ChR if $\lim_{k\to\infty} \mathrm{DE}(\mathcal{E}_k;\mathcal{D}_k;\mathsf{ChR}_k) = 0$. We say that encryption scheme $\mathcal{E}$ is decryptable relative to
ChR if there exists a correct decryption scheme for $\mathcal{E}$ relative to ChR.

This standard requirement from the I&C literature is, however, weak in that the rate at which 
the decryption error approaches zero could be very slow. For example, it would be met when 
$\mathrm{DE}(\mathcal{E}_k;\mathcal{D}_k;\mathsf{ChR}_k) = 1/\lg k$. We propose a stronger requirement, namely that the decryption
error vanish exponentially in $k$. Thus we say that $\mathcal{D}$ is a correct strong decryption scheme for $\mathcal{E}$ relative
to ChR if there are constants $d, e > 0$ such that $-\lg(\mathrm{DE}(\mathcal{E}_k;\mathcal{D}_k;\mathsf{ChR}_k)) \ge d k^e$ for all $k \in \mathbb{N}$. We
say that encryption scheme £ is strongly decryptable relative to ChR if there exists a correct strong 
decryption scheme for £ relative to ChR. 

We say that a family $\{S_k\}_{k \in \mathbb{N}}$ (e.g. an encryption or decryption scheme) is polynomial-time computable
if there is a polynomial time computable function which on input $1^k$ (the unary representation
of $k$) and $x$ returns $S_k(x)$.

3.2 Security metrics 

We are interested in measuring the security provided by an encryption function $\mathcal{E}\colon \{0,1\}^m \to \{0,1\}^c$
relative to an adversary channel $\mathsf{ChA}\colon \{0,1\}^c \to \{0,1\}^d$. (The receiver channel is not relevant to
security.) A security metric xs associates to $\mathcal{E}; \mathsf{ChA}$ a real number $\mathbf{Adv}^{\mathrm{xs}}(\mathcal{E};\mathsf{ChA})$. Intuitively, the
latter is the amount of "information" about the message $M$, as measured by metric xs, that is present
in the adversary ciphertext $\mathsf{ChA}(\mathcal{E}(M))$. The smaller this number, the more secure is $\mathcal{E}; \mathsf{ChA}$ according
to the metric in question. The metrics we will define are semantic security (ss), distinguishing security
(ds), mutual-information security (mis) and mutual-information security for random messages (mis-r).

In cryptography it is customary to measure security in bits, saying that $\mathcal{E}; \mathsf{ChA}$ has $s$ bits of
xs-security if $\mathbf{Adv}^{\mathrm{xs}}(\mathcal{E};\mathsf{ChA}) \le 2^{-s}$. There is no formal definition of an encryption function being "secure"
or "insecure," security rather being quantitative, given by a certain number of bits. A qualitative
definition of "secure," meaning one under which something is secure or not secure, may only be made
asymptotically, meaning for schemes. We say that encryption scheme $\mathcal{E} = \{\mathcal{E}_k\}_{k \in \mathbb{N}}$ is XS-secure relative
to $\mathsf{ChA} = \{\mathsf{ChA}_k\}_{k \in \mathbb{N}}$ if $\lim_{k\to\infty} \mathbf{Adv}^{\mathrm{xs}}(\mathcal{E}_k;\mathsf{ChA}_k) = 0$. We say $\mathcal{E} = \{\mathcal{E}_k\}_{k \in \mathbb{N}}$ is strongly XS-secure
relative to $\mathsf{ChA} = \{\mathsf{ChA}_k\}_{k \in \mathbb{N}}$ if there are constants $e, d > 0$ such that $-\lg \mathbf{Adv}^{\mathrm{xs}}(\mathcal{E}_k;\mathsf{ChA}_k) \ge d k^e$ for
all $k \in \mathbb{N}$.

With encryption function $\mathcal{E}\colon \{0,1\}^m \to \{0,1\}^c$ and adversary channel $\mathsf{ChA}\colon \{0,1\}^c \to \{0,1\}^d$
fixed, we now move to defining the above-mentioned metrics.

3.3 Mutual-information metrics 

We dub the metric in current use mutual-information security for random messages (mis-r). The
corresponding advantage is defined via

$$\mathbf{Adv}^{\mathrm{mis\text{-}r}}(\mathcal{E};\mathsf{ChA}) = I(U;\mathsf{ChA}(\mathcal{E}(U))) \tag{1}$$

where the random variable $U$ is uniformly distributed over $\{0,1\}^m$. The weakness of this metric is that
it only considers uniformly distributed messages. Yet "real" messages are not uniformly distributed.
Indeed, they may be drawn from small and structured spaces. They may be English text. They may
take only values in some small and known set $S$. For example, they may be votes, where $S = \{0, 1\}$,
or scores on an exam, where $S = \{0, 1, \ldots, 100\}$. Mis-r security will thus not ensure security when
encrypting the type of data that actually arises in applications. This leads us to define
mutual-information security (mis) via

$$\mathbf{Adv}^{\mathrm{mis}}(\mathcal{E};\mathsf{ChA}) = \max_{M} I(M;\mathsf{ChA}(\mathcal{E}(M))) \tag{2}$$

where the maximum is over all random variables $M$ over $\{0,1\}^m$. This definition is what we call
message-distribution independent in that security is required regardless of how messages are
distributed. In this way, distributions that arise in applications are captured.
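
The gap between Equations (1) and (2) can be seen numerically on a toy example. In the sketch below, `view[m][z]` stands for $\Pr[\mathsf{ChA}(\mathcal{E}(m)) = z]$; the particular table is a hypothetical one-bit example of our own in which message 0 always yields $z = 0$ while message 1 yields a uniform bit, and it is not a scheme from the paper. For one-bit messages every distribution is determined by $q = \Pr[M = 1]$, so a grid search over $q$ gives a lower bound on the mis advantage.

```python
import math


def mutual_information(prior, view):
    """I(M; Z) with prior[m] = Pr[M = m] and view[m][z] = Pr[Z = z | M = m]."""
    pz = {}
    for m, pm in prior.items():
        for z, p in view[m].items():
            pz[z] = pz.get(z, 0.0) + pm * p
    info = 0.0
    for m, pm in prior.items():
        for z, p in view[m].items():
            if pm > 0 and p > 0:
                info += pm * p * math.log2(p / pz[z])
    return info


# Hypothetical adversary view for 1-bit messages (an assumption for the demo).
view = {0: {0: 1.0}, 1: {0: 0.5, 1: 0.5}}

# Equation (1): uniform messages only.
adv_mis_r = mutual_information({0: 0.5, 1: 0.5}, view)

# Equation (2), approximated from below by a grid over q = Pr[M = 1].
adv_mis_lower = max(
    mutual_information({0: 1 - q / 100, 1: q / 100}, view) for q in range(101)
)
```

In this example the maximizing distribution is not uniform, so the mis advantage strictly exceeds the mis-r advantage, which is the kind of gap that motivates requiring message-distribution independence.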
We remark that the importance of message-distribution independence of a metric was understood
by Shannon [42]. His definition of perfect privacy did not assume uniformly distributed messages or,
in fact, any particular distribution on messages, but held across all distributions. It is curious that 
this feature was dropped in defining a security metric for the wiretap channel. We have resurrected it. 

In cryptography, the importance of message-distribution independence has been understood since 
[19] and is now ubiquitously viewed as necessary for a "good" definition. We will continue to require
message-distribution independence in our subsequent metrics. 

In the wiretap literature it is sometimes argued that the assumption of uniformly distributed 
messages is tenable because messages are compressed before transmission. But compression does not 
result in uniformly distributed messages. Compression is a deterministic, injective function and does 
not change the entropy. 

At this point we have in mis a legitimate metric, but one whose underlying intuition is, at least for 
cryptographers, obscure. Mis-security measures the difference in the number of bits required to encode 
the message before and after seeing the ciphertext. Cryptographers have very different approaches 
and intuition with regard to security and we now turn to definitions based on those. 

3.4 Semantic security 

We define the ss advantage via 

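$$\mathbf{Adv}^{\mathrm{ss}}(\mathcal{E};\mathsf{ChA}) = \max_{f,M}\Big(\max_{\mathcal{A}}\Pr[\mathcal{A}(\mathsf{ChA}(\mathcal{E}(M))) = f(M)] - \max_{\mathcal{S}}\Pr[\mathcal{S}(m) = f(M)]\Big) . \tag{3}$$

The maximum is over all random variables $M$ over $\{0,1\}^m$ and all transforms $f$ with domain $\{0,1\}^m$.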

Think of the adversary $A$, given the adversary ciphertext $\mathsf{ChA}(\mathcal{E}(\mathsf{M}))$, as trying to output some
function $f$ of the message $\mathsf{M}$. In the simplest case $f$ is the identity function, so that the adversary is
trying to recover the message. However, figuring out partial information about the message (rather
than the entire message) should still be considered a violation of security. For example, the first bit
of the message could be a vote that we want to hide, which would be captured by letting $f$ be the
function that returns the first bit of its input. To ensure that no partial information leaks, we allow
$f$ to be any function.

The term $\Pr[A(\mathsf{ChA}(\mathcal{E}(\mathsf{M}))) = f(\mathsf{M})]$, maximized over all adversaries, represents the maximum
possible probability that an adversary could output $f(\mathsf{M})$ given $\mathsf{ChA}(\mathcal{E}(\mathsf{M}))$. This by itself, however,
is not a measure of its success, because knowledge of $f, \mathsf{M}$ entails that the adversary has some a priori
probability of being able to output $f(\mathsf{M})$ even if it did not see the ciphertext. The subtracted term
accounts for this. It is the maximum, over all simulators $S$, of the probability $\Pr[S(m) = f(\mathsf{M})]$
that the simulator can figure out $f(\mathsf{M})$ given nothing but the length of the message and the implicit
knowledge of $f, \mathsf{M}$ represented by the order of quantification. A simulator is (recall) simply a transform,
the name chosen to allude to its role. The need to so adjust for a priori knowledge was recognized
by Shannon [42] in the context of perfect privacy and by Goldwasser and Micali in their definition of
semantic security for public-key cryptosystems [19].

Finally, the outer maximum over $f$ and $\mathsf{M}$ means that we require no partial information to leak,
regardless of message distribution.
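
As an illustration, the bracketed term of Eq. (3) can be evaluated exactly for one fixed pair $(f, \mathsf{M})$ on a toy instance. The sketch below is not from the paper: it assumes 2-bit messages sent in the clear over a channel that flips each bit independently with probability 1/4, and it uses the standard fact that the best adversary outputs, for each ciphertext, the most likely value of $f(\mathsf{M})$, while the best simulator outputs the most likely value a priori. The names `ss_term` and `noisy_view` are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction

def ss_term(msg_dist, view, f):
    """max_A Pr[A(C) = f(M)] - max_S Pr[S(m) = f(M)] for one fixed (f, M).

    msg_dist: dict message -> Pr[M = message]
    view:     dict message -> dict ciphertext -> Pr[adversary sees ciphertext]
    """
    prior = defaultdict(Fraction)   # distribution of f(M)
    joint = defaultdict(Fraction)   # joint distribution of (C, f(M))
    for msg, pm in msg_dist.items():
        prior[f(msg)] += pm
        for c, pc in view[msg].items():
            joint[(c, f(msg))] += pm * pc
    # Best adversary: on each ciphertext, output the most likely value of f(M).
    best = defaultdict(Fraction)
    for (c, value), prob in joint.items():
        best[c] = max(best[c], prob)
    # Best simulator: output the most likely value of f(M) without a ciphertext.
    return sum(best.values()) - max(prior.values())

def noisy_view(msg, p):
    """Output distribution of a channel flipping each bit of msg with probability p."""
    view = {}
    for i in range(2 ** len(msg)):
        c = format(i, "0{}b".format(len(msg)))
        flips = sum(a != b for a, b in zip(msg, c))
        view[c] = p ** flips * (1 - p) ** (len(msg) - flips)
    return view

# Toy instance (an assumption for illustration): identity "encryption".
MESSAGES = ["00", "01", "10", "11"]
P_FLIP = Fraction(1, 4)
VIEW = {msg: noisy_view(msg, P_FLIP) for msg in MESSAGES}
UNIFORM = {msg: Fraction(1, 4) for msg in MESSAGES}

first_bit_gain = ss_term(UNIFORM, VIEW, lambda msg: msg[0])
whole_msg_gain = ss_term(UNIFORM, VIEW, lambda msg: msg)
print(first_bit_gain, whole_msg_gain)  # 1/4 5/16
```

The ss advantage itself is the maximum of this quantity over all $f$ and all message distributions, so each value printed here is only a lower bound on it for the toy instance.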

This adversary-based formulation reflects the cryptographic approach and, with it, semantic
security quite directly captures a strong and natural intuition, namely that you can't compute any
function of the encrypted message with probability better than what you could guess if you never had
the ciphertext. Having written it this way, however, we now remark that it can be rewritten using
information-theoretic quantities. Namely, we claim that

$$\mathbf{Adv}^{\mathrm{ss}}(\mathcal{E};\mathsf{ChA}) = \max_{f,\mathsf{M}} \Big( \mathbf{GP}(f(\mathsf{M}) \mid \mathsf{ChA}(\mathcal{E}(\mathsf{M}))) - \mathbf{GP}(f(\mathsf{M})) \Big) \qquad (4)$$

where the maximum is over all random variables $\mathsf{M}$ over $\{0,1\}^m$ and all transforms $f$ with domain
$\{0,1\}^m$. The proof that Equations (3) and (4) define the same quantities is left as an exercise.

3.5 Distinguishing security

We define the ds advantage via

$$\begin{aligned}
\mathbf{Adv}^{\mathrm{ds}}(\mathcal{E};\mathsf{ChA}) &= \max_{A, M_0, M_1} \Big( 2\Pr[A(M_0, M_1, \mathsf{ChA}(\mathcal{E}(M_b))) = b] - 1 \Big) \qquad (5) \\
&= \max_{M_0, M_1} \mathbf{SD}(\mathsf{ChA}(\mathcal{E}(M_0)); \mathsf{ChA}(\mathcal{E}(M_1))) . \qquad (6)
\end{aligned}$$

In Eq. (5), $b$ is a random variable uniformly distributed over $\{0,1\}$. The maximization is over
all $M_0, M_1 \in \{0,1\}^m$ (we stress these are strings, not random variables) and all adversaries $A$.
$\Pr[A(M_0, M_1, \mathsf{ChA}(\mathcal{E}(M_b))) = b]$ is the probability that adversary $A$, given $m$-bit messages $M_0, M_1$
and an adversary ciphertext emanating from $M_b$, correctly identifies the random challenge bit $b$. The
a priori success probability being 1/2, the advantage is appropriately scaled. This advantage is equal
to the statistical distance between the random variables $\mathsf{ChA}(\mathcal{E}(M_0))$ and $\mathsf{ChA}(\mathcal{E}(M_1))$ as per Eq. (6).

In this definition, one only considers the subclass of message distributions whose support is a set 
of at most two equi-probable messages. Yet it will be equivalent to semantic security. The value of ds 
is that it is easier to work with than ss yet equivalent to it. 
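
Because the ds advantage is a maximum of statistical distances over pairs of messages, it can be computed by brute force on small examples. The following sketch assumes the adversary's view of each message is given explicitly as a probability table; the two toy views (a one-bit one-time pad seen without noise, and a bit sent in the clear through a channel with flip probability 1/4) and the function names are illustrative assumptions, not constructions from the paper.

```python
from fractions import Fraction
from itertools import combinations

def statistical_distance(P, Q):
    """SD(P; Q) = (1/2) * sum_x |P(x) - Q(x)| for dicts value -> probability."""
    support = set(P) | set(Q)
    return sum(abs(P.get(x, 0) - Q.get(x, 0)) for x in support) / 2

def ds_advantage(view):
    """Maximum over message pairs of the distance between adversary views."""
    return max(statistical_distance(view[a], view[b])
               for a, b in combinations(view, 2))

HALF = Fraction(1, 2)
# One-time pad on a single bit: the ciphertext is uniform whatever the message.
OTP_VIEW = {"0": {"0": HALF, "1": HALF}, "1": {"0": HALF, "1": HALF}}
# One bit sent in the clear over a channel that flips it with probability 1/4.
LEAKY_VIEW = {"0": {"0": Fraction(3, 4), "1": Fraction(1, 4)},
              "1": {"0": Fraction(1, 4), "1": Fraction(3, 4)}}

print(ds_advantage(OTP_VIEW), ds_advantage(LEAKY_VIEW))  # 0 1/2
```

Only $\binom{2^m}{2}$ pairs of strings have to be examined, with no maximization over message distributions or functions, which is one concrete sense in which ds is easier to work with than ss.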



4 Relations 

We establish the relations summarized in Figure 2.



4.1 DS is equivalent to SS 

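The following says that SS and DS are equivalent up to a small constant factor. The proof is an
extension of the classical ones in computational cryptography.

Theorem 4.1 [DS $\leftrightarrow$ SS] Let $\mathcal{E}\colon \{0,1\}^m \to \{0,1\}^c$ be an encryption algorithm and $\mathsf{ChA}$ an adversary
channel. Then $\mathbf{Adv}^{\mathrm{ss}}(\mathcal{E};\mathsf{ChA}) \le \mathbf{Adv}^{\mathrm{ds}}(\mathcal{E};\mathsf{ChA}) \le 2 \cdot \mathbf{Adv}^{\mathrm{ss}}(\mathcal{E};\mathsf{ChA})$.

Proof: Let $A_{\mathrm{ss}}$ be an adversary attacking the SS security of $\mathcal{E};\mathsf{ChA}$. We construct $A_{\mathrm{ds}}$ attacking the
DS security of $\mathcal{E};\mathsf{ChA}$ as follows. On inputs $M_0, M_1$ and adversary ciphertext $C$, adversary $A_{\mathrm{ds}}$ runs
$A_{\mathrm{ss}}$ on input $C$ to get a value $v$. If $v = f(M_1)$ it outputs 1, else it outputs 0. Let $\mathsf{M}_0, \mathsf{M}_1$ be distributed
identically to $\mathsf{M}$ but independently of each other. Then

$$\begin{aligned}
\Pr[A_{\mathrm{ds}}(\mathsf{M}_0, \mathsf{M}_1, \mathsf{ChA}(\mathcal{E}(\mathsf{M}_1))) = 1] &= \Pr[A_{\mathrm{ss}}(\mathsf{ChA}(\mathcal{E}(\mathsf{M}))) = f(\mathsf{M})] \\
\Pr[A_{\mathrm{ds}}(\mathsf{M}_0, \mathsf{M}_1, \mathsf{ChA}(\mathcal{E}(\mathsf{M}_0))) = 1] &\le \max_{S} \Pr[S(m) = f(\mathsf{M})] .
\end{aligned}$$

Subtracting, we get

$$\begin{aligned}
\Pr[A_{\mathrm{ss}}(\mathsf{ChA}(\mathcal{E}(\mathsf{M}))) = f(\mathsf{M})] &- \max_{S} \Pr[S(m) = f(\mathsf{M})] \\
&\le \Pr[A_{\mathrm{ds}}(\mathsf{M}_0, \mathsf{M}_1, \mathsf{ChA}(\mathcal{E}(\mathsf{M}_1))) = 1] - \Pr[A_{\mathrm{ds}}(\mathsf{M}_0, \mathsf{M}_1, \mathsf{ChA}(\mathcal{E}(\mathsf{M}_0))) = 1] \\
&= 2\Pr[A_{\mathrm{ds}}(\mathsf{M}_0, \mathsf{M}_1, \mathsf{ChA}(\mathcal{E}(\mathsf{M}_b))) = b] - 1 \\
&\le \max_{M_0, M_1} \Big( 2\Pr[A_{\mathrm{ds}}(M_0, M_1, \mathsf{ChA}(\mathcal{E}(M_b))) = b] - 1 \Big) .
\end{aligned}$$

Taking the max over all adversaries and all $\mathsf{M}$ yields $\mathbf{Adv}^{\mathrm{ss}}(\mathcal{E};\mathsf{ChA}) \le \mathbf{Adv}^{\mathrm{ds}}(\mathcal{E};\mathsf{ChA})$.

We now prove $\mathbf{Adv}^{\mathrm{ds}}(\mathcal{E};\mathsf{ChA}) \le 2 \cdot \mathbf{Adv}^{\mathrm{ss}}(\mathcal{E};\mathsf{ChA})$. For any (wlog) distinct $M_0, M_1 \in \{0,1\}^m$ let
$\mathsf{M}^{M_0,M_1}$ denote the random variable that is uniformly distributed over $\{M_0, M_1\}$. Then

$$\begin{aligned}
\frac{1}{2} \cdot \mathbf{Adv}^{\mathrm{ds}}(\mathcal{E};\mathsf{ChA}) &= \max_{M_0, M_1} \max_{A_{\mathrm{ds}}} \Big( \Pr[A_{\mathrm{ds}}(M_0, M_1, \mathsf{ChA}(\mathcal{E}(M_b))) = b] - \frac{1}{2} \Big) \\
&= \max_{M_0, M_1} \max_{A_{\mathrm{ss}}} \Big( \Pr[A_{\mathrm{ss}}(\mathsf{ChA}(\mathcal{E}(\mathsf{M}^{M_0,M_1}))) = \mathsf{M}^{M_0,M_1}] - \max_{S} \Pr[S(m) = \mathsf{M}^{M_0,M_1}] \Big) \\
&\le \mathbf{Adv}^{\mathrm{ss}}(\mathcal{E};\mathsf{ChA}) ,
\end{aligned}$$

where the max is over all distinct $M_0, M_1 \in \{0,1\}^m$. This simply reflects the fact that on a distribution
over two distinct messages $M_0, M_1$, if $b$ is random, finding $b$ and finding $M_b$ are equivalent tasks.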

A corollary of this is that an encryption scheme $\mathcal{E} = \{\mathcal{E}_k\}_{k \in \mathbb{N}}$ is (strongly) SS-secure relative to a
family of channels $\mathsf{ChA} = \{\mathsf{ChA}_k\}_{k \in \mathbb{N}}$ if and only if it is (respectively, strongly) DS-secure relative to
the same family of channels. We stress that Theorem 4.1, and this corollary, hold for all channels,
meaning channels specified by arbitrary transforms. This certainly includes binary channels such as
the one induced by the binary symmetric channel or other symmetric channels. However, it also
includes many other channels. An (implicit) assumption, however, is that successive uses of the channel
are independent, meaning if multiple messages are encrypted then the different uses of $\mathsf{ChA}$ are
independent.

The equivalence between DS and SS is helpful because DS is more analytically tractable than SS, 
and we exploit it when we come to the design of solutions. 

4.2 MIS versus DS 

Neither direction of the equivalence between MIS and DS is trivial. Going back to Eq. (2), $\mathbf{H}(\mathsf{M})$
is the minimum number of bits to encode a message from $\mathsf{M}$ and $\mathbf{H}(\mathsf{M} \mid \mathsf{ChA}(\mathcal{E}(\mathsf{M})))$ is the minimum
number of bits to encode the message after seeing the ciphertext, so mis security measures the reduction
in encoding length of messages allowed by seeing the ciphertext. It is not intuitively clear how
this encoding length intuition relates to statistical distance or semantic security. From the analytic
perspective, letting $\mathsf{C} = \mathsf{ChA}(\mathcal{E}(\mathsf{M}))$, the mutual information $\mathbf{I}(\mathsf{M};\mathsf{C})$ involved in Eq. (2) is

$$-\sum_{M} \Pr[\mathsf{M} = M] \lg \Pr[\mathsf{M} = M] + \sum_{C} \Pr[\mathsf{C} = C] \sum_{M} \Pr[\mathsf{M} = M \mid \mathsf{C} = C] \cdot \lg \Pr[\mathsf{M} = M \mid \mathsf{C} = C] .$$

This is a complex expression that looks quite different from statistical distance, in particular because
of the logarithms. Additionally we need to consider the maximum over $\mathsf{M}$. We will first show that
MIS implies DS and then by an entirely different technique that DS implies MIS.

4.3 MIS implies DS 

We begin by recalling that the KL divergence is a (non-symmetric) distance measure for probability distributions $P, Q$
defined by $\mathbf{D}(P;Q) = \sum_{x} P(x) \lg (P(x)/Q(x))$. Let $\mathsf{M}, \mathsf{C}$ be random variables. Probability distributions
$J_{\mathsf{M},\mathsf{C}}, I_{\mathsf{M},\mathsf{C}}$ are defined for all $M, C$ by $J_{\mathsf{M},\mathsf{C}}(M,C) = \Pr[\mathsf{M} = M, \mathsf{C} = C]$ and $I_{\mathsf{M},\mathsf{C}}(M,C) =
\Pr[\mathsf{M} = M] \cdot \Pr[\mathsf{C} = C]$. Thus $J_{\mathsf{M},\mathsf{C}}$ is the joint distribution of $\mathsf{M}$ and $\mathsf{C}$, while $I_{\mathsf{M},\mathsf{C}}$ is the "independent"
or product distribution. The following lemma recasts mutual information in terms of the KL
divergence between the joint and independent distributions.

Lemma 4.2 Let $\mathsf{M}, \mathsf{C}$ be random variables. Then $\mathbf{I}(\mathsf{M};\mathsf{C}) = \mathbf{D}(J_{\mathsf{M},\mathsf{C}}; I_{\mathsf{M},\mathsf{C}})$.

The proof is standard and recalled for completeness in Appendix B.1. We have taken this path in the
hope of exploiting Pinsker's inequality (from [38], with the tight constant from [12]), which lower
bounds the KL divergence between two distributions in terms of their statistical distance:

Lemma 4.3 [Pinsker's Inequality] Let $P, Q$ be probability distributions. Then $\mathbf{D}(P;Q) \ge 2 \cdot
\mathbf{SD}(P;Q)^2$.
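
Both lemmas are easy to sanity-check numerically. The sketch below draws random joint distributions of a message and a ciphertext (the sizes 3 and 4 and the seeds are arbitrary choices made here), computes the mutual information from entropies as $\mathbf{H}(\mathsf{M}) + \mathbf{H}(\mathsf{C}) - \mathbf{H}(\mathsf{M},\mathsf{C})$, and compares it with the KL divergence between the joint and product distributions and with twice the squared statistical distance. Logarithms are base 2 throughout; the helper names are hypothetical.

```python
import math
import random

def entropy(dist):
    """Shannon entropy in bits of a dict value -> probability."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def kl_divergence(P, Q):
    """D(P; Q) in bits; assumes Q(x) > 0 wherever P(x) > 0."""
    return sum(p * math.log2(p / Q[x]) for x, p in P.items() if p > 0)

def statistical_distance(P, Q):
    return sum(abs(P[x] - Q[x]) for x in P) / 2

def random_joint(n_msgs, n_cts, rng):
    """A random joint distribution J(m, c) with full support."""
    weights = {(m, c): rng.random() + 1e-3
               for m in range(n_msgs) for c in range(n_cts)}
    total = sum(weights.values())
    return {mc: w / total for mc, w in weights.items()}

def marginals_and_product(J):
    pm, pc = {}, {}
    for (m, c), p in J.items():
        pm[m] = pm.get(m, 0.0) + p
        pc[c] = pc.get(c, 0.0) + p
    product = {(m, c): pm[m] * pc[c] for (m, c) in J}
    return pm, pc, product

def check(seed):
    """Return (I(M;C), D(J;I), SD(J;I)) for one random joint distribution."""
    J = random_joint(3, 4, random.Random(seed))
    pm, pc, I = marginals_and_product(J)
    mutual_info = entropy(pm) + entropy(pc) - entropy(J)
    return mutual_info, kl_divergence(J, I), statistical_distance(J, I)

for seed in range(5):
    mi, div, dist = check(seed)
    print(f"I={mi:.6f}  D={div:.6f}  2*SD^2={2 * dist ** 2:.6f}")
```

In every row the first two columns should agree up to rounding (Lemma 4.2) and the third should not exceed them (Lemma 4.3).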

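At this point, from Lemmas 4.2 and 4.3, letting $\mathsf{C} = \mathsf{ChA}(\mathcal{E}(\mathsf{M}))$ denote the adversary ciphertext, we
have

$$\mathbf{Adv}^{\mathrm{mis}}(\mathcal{E};\mathsf{ChA}) = \max_{\mathsf{M}} \mathbf{D}(J_{\mathsf{M},\mathsf{C}}; I_{\mathsf{M},\mathsf{C}}) \ge 2 \cdot \max_{\mathsf{M}} \mathbf{SD}(J_{\mathsf{M},\mathsf{C}}; I_{\mathsf{M},\mathsf{C}})^2 . \qquad (7)$$

We would now like to connect the RHS of Eq. (7) to $\mathbf{Adv}^{\mathrm{ds}}(\mathcal{E};\mathsf{ChA})$ so as to lower bound the mis
advantage in terms of the ds advantage. (The upper bound will need a completely different approach,
Pinsker's inequality being of no use for it.) It turns out that one can show that

$$\max_{\mathsf{M}} \mathbf{SD}(J_{\mathsf{M},\mathsf{C}}; I_{\mathsf{M},\mathsf{C}}) \le \mathbf{Adv}^{\mathrm{ds}}(\mathcal{E};\mathsf{ChA}) . \qquad (8)$$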






This inequality, unfortunately, goes the wrong way, in the sense that combining it with Eq. (7) does
not lower bound mis in terms of ds. The observation that gets around this is that the inequality of
Eq. (8) becomes an equality, up to a factor of 1/2, when one restricts attention to $\mathsf{M}$ distributed uniformly over a set of two
messages. More precisely:

Lemma 4.4 Let $M_0, M_1 \in \{0,1\}^m$ and let $\mathsf{M}$ be uniformly distributed over $\{M_0, M_1\}$. Let $g\colon \{0,1\}^m \to
\{0,1\}^c$ be a transform and let $\mathsf{C} = g(\mathsf{M})$. Then

$$\mathbf{SD}(J_{\mathsf{M},\mathsf{C}}; I_{\mathsf{M},\mathsf{C}}) = \frac{1}{2} \cdot \mathbf{SD}(g(M_0); g(M_1)) .$$

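Proof: We have

$$\begin{aligned}
\mathbf{SD}(J_{\mathsf{M},\mathsf{C}}; I_{\mathsf{M},\mathsf{C}}) &= \frac{1}{2} \sum_{m,c} |J_{\mathsf{M},\mathsf{C}}(m,c) - I_{\mathsf{M},\mathsf{C}}(m,c)| \\
&= \frac{1}{2} \sum_{m,c} |P_{\mathsf{M}}(m) \cdot P_{\mathsf{C}|\mathsf{M}=m}(c) - P_{\mathsf{M}}(m) \cdot P_{\mathsf{C}}(c)| \\
&= \frac{1}{2} \sum_{m} P_{\mathsf{M}}(m) \sum_{c} |P_{\mathsf{C}|\mathsf{M}=m}(c) - P_{\mathsf{C}}(c)| \\
&= \frac{1}{2} \sum_{m} P_{\mathsf{M}}(m) \sum_{c} \Big| P_{\mathsf{C}|\mathsf{M}=m}(c) - \sum_{m'} P_{\mathsf{C}|\mathsf{M}=m'}(c) \cdot P_{\mathsf{M}}(m') \Big| \\
&= \frac{1}{2} \sum_{m} P_{\mathsf{M}}(m) \sum_{c} \Big| \sum_{m'} \big( P_{\mathsf{C}|\mathsf{M}=m}(c) - P_{\mathsf{C}|\mathsf{M}=m'}(c) \big) \cdot P_{\mathsf{M}}(m') \Big| .
\end{aligned}$$

So far we have not used the assumption that $\mathsf{M}$ takes only two values, each with probability 1/2. Let
$a(M_0) = M_1$ and $a(M_1) = M_0$. Then the sum over $m'$ above has only one candidate non-zero term,
namely the one corresponding to $m' = a(m)$. So the above equals

$$\begin{aligned}
\frac{1}{2} \sum_{m} P_{\mathsf{M}}(m) \sum_{c} & \big| \big( P_{\mathsf{C}|\mathsf{M}=m}(c) - P_{\mathsf{C}|\mathsf{M}=a(m)}(c) \big) \cdot P_{\mathsf{M}}(a(m)) \big| \\
&= \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{1}{2} \sum_{m} \sum_{c} |P_{\mathsf{C}|\mathsf{M}=M_0}(c) - P_{\mathsf{C}|\mathsf{M}=M_1}(c)| \\
&= \frac{1}{2} \cdot \frac{1}{2} \sum_{c} |P_{\mathsf{C}|\mathsf{M}=M_0}(c) - P_{\mathsf{C}|\mathsf{M}=M_1}(c)| \\
&= \frac{1}{2} \cdot \mathbf{SD}(P_{\mathsf{C}|\mathsf{M}=M_0}; P_{\mathsf{C}|\mathsf{M}=M_1}) \\
&= \frac{1}{2} \cdot \mathbf{SD}(g(M_0); g(M_1)) .
\end{aligned}$$

This completes the proof.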
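
The identity of Lemma 4.4 can be checked with exact rational arithmetic. In the sketch below the two output distributions `G0` and `G1`, standing for $g(M_0)$ and $g(M_1)$, are arbitrary illustrative tables and the function names are hypothetical.

```python
from fractions import Fraction

def statistical_distance(P, Q):
    support = set(P) | set(Q)
    return sum(abs(P.get(x, 0) - Q.get(x, 0)) for x in support) / 2

def joint_vs_product_distance(g0, g1):
    """SD(J; I) when M is uniform over two messages with output laws g0, g1."""
    half = Fraction(1, 2)
    cts = set(g0) | set(g1)
    # Marginal distribution of the ciphertext C.
    p_c = {c: half * g0.get(c, 0) + half * g1.get(c, 0) for c in cts}
    J, I = {}, {}
    for m, g in ((0, g0), (1, g1)):
        for c in cts:
            J[(m, c)] = half * g.get(c, 0)   # joint distribution
            I[(m, c)] = half * p_c[c]        # product of the marginals
    return statistical_distance(J, I)

G0 = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}
G1 = {"a": Fraction(1, 6), "b": Fraction(1, 3), "c": Fraction(1, 2)}
print(joint_vs_product_distance(G0, G1), statistical_distance(G0, G1))  # 1/6 1/3
```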



Now we combine the lemmas to prove that MIS implies DS: 

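Theorem 4.5 [MIS $\to$ DS] Let $\mathcal{E}\colon \{0,1\}^m \to \{0,1\}^c$ be an encryption algorithm and $\mathsf{ChA}$ an adversary
channel. Then $\mathbf{Adv}^{\mathrm{ds}}(\mathcal{E};\mathsf{ChA}) \le \sqrt{2 \cdot \mathbf{Adv}^{\mathrm{mis}}(\mathcal{E};\mathsf{ChA})}$.

Proof: We have already applied Lemmas 4.2 and 4.3 to get Eq. (7). Now let $M_0, M_1 \in \{0,1\}^m$
be messages for which $\mathbf{SD}(\mathsf{ChA}(\mathcal{E}(M_0)); \mathsf{ChA}(\mathcal{E}(M_1)))$ equals $\mathbf{Adv}^{\mathrm{ds}}(\mathcal{E};\mathsf{ChA})$. Let $\mathsf{M}$ be uniformly
distributed over $\{M_0, M_1\}$. Since $\mathsf{M}$ is in the scope of the max in (7), we can apply Lemma 4.4 to see
that the RHS of Eq. (7) is at least

$$2 \cdot \mathbf{SD}(J_{\mathsf{M},\mathsf{C}}; I_{\mathsf{M},\mathsf{C}})^2 = 2 \cdot \frac{1}{4} \cdot \mathbf{SD}(\mathsf{ChA}(\mathcal{E}(M_0)); \mathsf{ChA}(\mathcal{E}(M_1)))^2 = \frac{1}{2} \cdot \mathbf{Adv}^{\mathrm{ds}}(\mathcal{E};\mathsf{ChA})^2 ,$$

where again $\mathsf{C} = \mathsf{ChA}(\mathcal{E}(\mathsf{M}))$.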






A corollary of Theorem 4.5 is that if an encryption scheme $\mathcal{E} = \{\mathcal{E}_k\}_{k \in \mathbb{N}}$ is (strongly) MIS-secure
relative to a family of channels $\mathsf{ChA} = \{\mathsf{ChA}_k\}_{k \in \mathbb{N}}$ then it is also (respectively, strongly) DS-secure
relative to the same family of channels. Again, Theorem 4.5, and this corollary, hold for all channels,
not just binary ones. On the quantitative side, however, Theorem 4.5 says that $s$ bits of mis security
imply $(s-1)/2 \approx s/2$ bits of ds security. We do not know whether this bound is tight.

4.4 DS implies MIS 

The proof is underlain by the following general lemma that bounds the difference in entropy between 
two distributions in terms of their statistical distance: 

Lemma 4.6 Let $P, Q$ be probability distributions. Let $N = |\mathrm{supp}(P) \cup \mathrm{supp}(Q)|$ and $\epsilon = \mathbf{SD}(P;Q)$.
Then $\mathbf{H}(P) - \mathbf{H}(Q) \le 2\epsilon \cdot \lg(N/\epsilon)$.

To prove this we need the following, which appears as Equation (16.24) in Section 16.3]: 
Lemma 4.7 Let $p, x \in [0, 1/2]$ and assume $p + x \le 1/2$. Then $|h(p + x) - h(p)| \le h(x)$.
The following is similar to the proof of Theorem 16.3.2]. 
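Proof of Lemma 4.6: Let $\delta(y) = |P(y)/2 - Q(y)/2| \le 1/2$ for all $y$. Then

$$\begin{aligned}
\mathbf{H}(P) - \mathbf{H}(Q) &= \sum_{y} \big( h(P(y)) - h(Q(y)) \big) = \sum_{y} \Big( P(y) \lg \frac{1}{P(y)} - Q(y) \lg \frac{1}{Q(y)} \Big) \\
&= \sum_{y} \Big( P(y) \lg \frac{1}{P(y)/2} - Q(y) \lg \frac{1}{Q(y)/2} \Big) = 2 \sum_{y} \Big( \frac{P(y)}{2} \lg \frac{1}{P(y)/2} - \frac{Q(y)}{2} \lg \frac{1}{Q(y)/2} \Big) \\
&= 2 \cdot \sum_{y} \big( h(P(y)/2) - h(Q(y)/2) \big) \le 2 \cdot \sum_{y} |h(P(y)/2) - h(Q(y)/2)| .
\end{aligned}$$

If $P(y) \le Q(y)$ then $P(y)/2 + \delta(y) = Q(y)/2 \le 1/2$ so we can apply Lemma 4.7 to get

$$|h(P(y)/2) - h(Q(y)/2)| = |h(P(y)/2) - h(P(y)/2 + \delta(y))| \le h(\delta(y)) .$$

On the other hand if $Q(y) < P(y)$ then $Q(y)/2 + \delta(y) = P(y)/2 \le 1/2$ so we can apply Lemma 4.7 to
get

$$|h(P(y)/2) - h(Q(y)/2)| = |h(Q(y)/2 + \delta(y)) - h(Q(y)/2)| \le h(\delta(y)) .$$

Continuing the above, and using $\sum_{y} \delta(y) = \epsilon$, we have

$$\begin{aligned}
\mathbf{H}(P) - \mathbf{H}(Q) &\le 2 \cdot \sum_{y} h(\delta(y)) = 2 \cdot \sum_{y} -\delta(y) \lg \delta(y) \\
&= 2\epsilon \cdot \sum_{y} -\frac{\delta(y)}{\epsilon} \lg \delta(y) = 2\epsilon \cdot \sum_{y} -\frac{\delta(y)}{\epsilon} \Big( \lg \frac{\delta(y)}{\epsilon} + \lg \epsilon \Big) \\
&= 2\epsilon \cdot \sum_{y} h(\delta(y)/\epsilon) - 2\epsilon \cdot \sum_{y} \frac{\delta(y)}{\epsilon} \lg \epsilon = 2\epsilon \cdot \sum_{y} h(\delta(y)/\epsilon) - 2 \lg \epsilon \cdot \sum_{y} \delta(y) \\
&= 2\epsilon \cdot \sum_{y} h(\delta(y)/\epsilon) - 2\epsilon \lg \epsilon \le 2\epsilon \cdot \lg N - 2\epsilon \lg \epsilon = 2\epsilon \cdot \lg \frac{N}{\epsilon}
\end{aligned}$$

as claimed.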
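
The bound of Lemma 4.6 can be exercised on random inputs. The following sketch is only a numerical sanity check under choices made here (list-based distributions over a common finite set, sizes between 2 and 16, a fixed seed); it does not replace the proof, and the function names are hypothetical.

```python
import math
import random

def entropy(P):
    """Shannon entropy in bits of a list of probabilities."""
    return -sum(p * math.log2(p) for p in P if p > 0)

def statistical_distance(P, Q):
    return sum(abs(p - q) for p, q in zip(P, Q)) / 2

def random_distribution(size, rng):
    weights = [rng.random() + 1e-9 for _ in range(size)]
    total = sum(weights)
    return [w / total for w in weights]

def lemma_4_6_holds(P, Q):
    """Check H(P) - H(Q) <= 2 * eps * lg(N / eps), with a small float tolerance."""
    eps = statistical_distance(P, Q)
    if eps == 0:
        return True  # identical distributions: both sides are zero
    N = sum(1 for p, q in zip(P, Q) if p > 0 or q > 0)
    bound = 2 * eps * math.log2(N / eps)
    return entropy(P) - entropy(Q) <= bound + 1e-12

rng = random.Random(1)
trials = [(random_distribution(n, rng), random_distribution(n, rng))
          for n in range(2, 17) for _ in range(20)]
print(all(lemma_4_6_holds(P, Q) for P, Q in trials))
```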

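To exploit this, we define the pairwise statistical distance between random variables $\mathsf{M}, \mathsf{C}$ via

$$\mathbf{PSD}(\mathsf{M};\mathsf{C}) = \max_{M_0, M_1} \mathbf{SD}(P_{\mathsf{C}|\mathsf{M}=M_0}; P_{\mathsf{C}|\mathsf{M}=M_1}) ,$$

where the maximum is over all $M_0, M_1 \in \mathrm{supp}(P_{\mathsf{M}})$. We now have:

Lemma 4.8 Let $\mathsf{M}, \mathsf{C}$ be random variables. Then $\mathbf{SD}(P_{\mathsf{C}}; P_{\mathsf{C}|\mathsf{M}=M}) \le \mathbf{PSD}(\mathsf{M};\mathsf{C})$ for any $M$.

Proof: Writing $m$ for the fixed message $M$, we have

$$\begin{aligned}
\frac{1}{2} \sum_{c} |P_{\mathsf{C}|\mathsf{M}=m}(c) - P_{\mathsf{C}}(c)| &= \frac{1}{2} \sum_{c} \Big| P_{\mathsf{C}|\mathsf{M}=m}(c) - \sum_{m'} P_{\mathsf{C}|\mathsf{M}=m'}(c) \cdot P_{\mathsf{M}}(m') \Big| \\
&= \frac{1}{2} \sum_{c} \Big| \sum_{m'} \big( P_{\mathsf{C}|\mathsf{M}=m}(c) - P_{\mathsf{C}|\mathsf{M}=m'}(c) \big) \cdot P_{\mathsf{M}}(m') \Big| \\
&\le \frac{1}{2} \sum_{c} \sum_{m'} |P_{\mathsf{C}|\mathsf{M}=m}(c) - P_{\mathsf{C}|\mathsf{M}=m'}(c)| \cdot P_{\mathsf{M}}(m') \\
&= \sum_{m'} P_{\mathsf{M}}(m') \cdot \frac{1}{2} \sum_{c} |P_{\mathsf{C}|\mathsf{M}=m}(c) - P_{\mathsf{C}|\mathsf{M}=m'}(c)| \\
&\le \sum_{m'} P_{\mathsf{M}}(m') \cdot \mathbf{PSD}(\mathsf{M};\mathsf{C}) \\
&= \mathbf{PSD}(\mathsf{M};\mathsf{C})
\end{aligned}$$

as claimed.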



We now show that DS implies MIS. 



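Theorem 4.9 [DS $\to$ MIS] Let $\mathcal{E}\colon \{0,1\}^m \to \{0,1\}^c$ be an encryption algorithm and $\mathsf{ChA}$ an adversary
channel. Let $\epsilon = \mathbf{Adv}^{\mathrm{ds}}(\mathcal{E};\mathsf{ChA})$. Then $\mathbf{Adv}^{\mathrm{mis}}(\mathcal{E};\mathsf{ChA}) \le 2\epsilon \cdot \lg(2^c/\epsilon)$.

Proof: Let $f(x) = \min(2x \lg(2^c/x), 1)$. Let $\mathsf{C} = \mathsf{ChA}(\mathcal{E}(\mathsf{M}))$ be the adversary ciphertext. Then

$$\mathbf{I}(\mathsf{M};\mathsf{C}) = \mathbf{I}(\mathsf{C};\mathsf{M}) = \mathbf{H}(P_{\mathsf{C}}) - \sum_{M} P_{\mathsf{M}}(M) \cdot \mathbf{H}(P_{\mathsf{C}|\mathsf{M}=M}) = \sum_{M} P_{\mathsf{M}}(M) \cdot \big( \mathbf{H}(P_{\mathsf{C}}) - \mathbf{H}(P_{\mathsf{C}|\mathsf{M}=M}) \big) .$$

By Lemma 4.6, each difference $\mathbf{H}(P_{\mathsf{C}}) - \mathbf{H}(P_{\mathsf{C}|\mathsf{M}=M})$ is bounded in terms of $x = \mathbf{SD}(P_{\mathsf{C}}; P_{\mathsf{C}|\mathsf{M}=M})$.
Let $y = \mathbf{PSD}(\mathsf{M};\mathsf{C})$. Lemma 4.8 says $x \le y$. The function $f$ has the property that $x \le y$ implies
$f(x) \le f(y)$. So the above is at most $f(y)$. Finally, take the maximum over all $\mathsf{M}$ on both sides.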

We would like as a corollary of Theorem 4.9 to say that if an encryption scheme $\mathcal{E} = \{\mathcal{E}_k\}_{k \in \mathbb{N}}$ is
(strongly) DS-secure relative to a family of channels $\mathsf{ChA} = \{\mathsf{ChA}_k\}_{k \in \mathbb{N}}$ then it is also (respectively,
strongly) MIS-secure relative to the same family of channels. Theorem 4.9 does not imply this for
all encryption schemes, but it usually does. Let $m, c$ be the message length and sender ciphertext
length of $\mathcal{E}$. Let $\epsilon(k) = \mathbf{Adv}^{\mathrm{ds}}(\mathcal{E}_k; \mathsf{ChA}_k)$ and $\delta(k) = 2\epsilon(k) \lg(2^{c(k)}/\epsilon(k))$. Usually $m, c$ are bounded by
polynomials in $k$ while $\epsilon(k)$ decreases exponentially in $k$. In this case, we have the desired conclusions.
We do under other conditions as well. The case where we would not is the unnatural one that $c(k)$ is
extremely fast growing, for example exponential in $k$, yet $\epsilon(k)$ decreases no faster than exponentially.

We do not know whether the bound of Theorem 4.9 is tight. The following, however, says that
the bound of Lemma 4.6 is tight up to a constant factor. This means that improving the bound of
Theorem 4.9 would require a different approach.



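Proposition 4.10 Let $n > k \ge 1$ be integers. Let $\epsilon = 2^{-k}$ and $N = 1 + \epsilon 2^n$. Then there are
distributions $P, Q$ with $|\mathrm{supp}(P) \cup \mathrm{supp}(Q)| = N$ and $\mathbf{SD}(P;Q) = \epsilon$ and $\mathbf{H}(P) - \mathbf{H}(Q) \ge 0.5 \cdot \epsilon \cdot
\lg(N/\epsilon)$.

Proof: Let $S = \{0, 1, \ldots, N-1\}$. Let $P, Q\colon S \to [0,1]$ be defined as follows. Let $P(i) = 2^{-n}$ for $i \ge 1$
and $P(0) = 1 - \epsilon$. Let $Q(i) = 0$ for $i \ge 1$ and $Q(0) = 1$. Then

$$\mathbf{SD}(P;Q) = \frac{1}{2} \cdot \big( \epsilon + (N-1) \cdot 2^{-n} \big) = \frac{1}{2} \cdot \big( \epsilon + \epsilon 2^n \cdot 2^{-n} \big) = \epsilon .$$

On the other hand

$$\mathbf{H}(P) - \mathbf{H}(Q) = \big( \epsilon 2^n \cdot h(2^{-n}) + h(1 - \epsilon) \big) - h(1) = \epsilon n + h(1 - \epsilon) \ge \epsilon n .$$

However

$$\frac{\epsilon}{2} \cdot \lg \frac{N}{\epsilon} = \frac{\epsilon}{2} \cdot \lg \frac{1 + \epsilon 2^n}{\epsilon} \le \frac{\epsilon}{2} \cdot \lg \frac{2\epsilon 2^n}{\epsilon} = \frac{\epsilon}{2} \cdot (n + 1) \le \epsilon n ,$$

which proves the claim.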
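
For concreteness, the pair of distributions used in this proof can be instantiated and the three quantities compared numerically. The sketch below is a floating-point check for a few parameter choices picked here, with hypothetical function names; it also reports the upper bound of Lemma 4.6 for comparison.

```python
import math

def entropy(P):
    """Shannon entropy in bits of a list of probabilities."""
    return -sum(p * math.log2(p) for p in P if p > 0)

def tight_pair(n, k):
    """The distributions P, Q from the proof, for integers n > k >= 1."""
    eps = 2.0 ** -k
    N = 1 + 2 ** (n - k)          # N = 1 + eps * 2^n
    P = [1 - eps] + [2.0 ** -n] * (N - 1)
    Q = [1.0] + [0.0] * (N - 1)
    return eps, N, P, Q

def report(n, k):
    """Return (SD(P;Q), H(P) - H(Q), lower bound, Lemma 4.6 upper bound)."""
    eps, N, P, Q = tight_pair(n, k)
    distance = sum(abs(p - q) for p, q in zip(P, Q)) / 2
    gap = entropy(P) - entropy(Q)
    lower = 0.5 * eps * math.log2(N / eps)
    upper = 2 * eps * math.log2(N / eps)
    return distance, gap, lower, upper

for n, k in [(2, 1), (6, 3), (10, 4)]:
    distance, gap, lower, upper = report(n, k)
    print(f"n={n} k={k}: SD={distance}  lower={lower:.4f}  gap={gap:.4f}  upper={upper:.4f}")
```

In each row the entropy gap should sit between the two bounds, which differ by the constant factor 4.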



4.5 MIS-R does not imply DS 

At this point we have justified all the numbered implication arrows in Figure 2. The un-numbered
implication MIS $\to$ MIS-R is trivial. Let us turn to the separation MIS-R $\not\to$ MIS. To justify it
we need to exhibit an encryption scheme $\mathcal{E}$ and a family $\mathsf{ChA}$ of adversary channels such that $\mathcal{E}$ is
(strongly) MIS-R-secure relative to $\mathsf{ChA}$ but not (respectively, strongly) DS-secure relative to $\mathsf{ChA}$.

This is easy to do with a contrived choice of $\mathsf{ChA}$. (Have the channel faithfully transmit inputs $0^m$
and $1^m$ and be very noisy on other inputs. Then MIS fails because the adversary has high advantage
when the message takes on only values $0^m, 1^m$ but MIS-R-security holds since these messages are
unlikely.) This however is not very convincing because of the obscure nature of the channel. We
will instead give an example where the channel is the binary symmetric one. The counter-example is
constructed by starting with a scheme that is MIS-R-secure (if none exists the separation question is
moot so we make the minimal assumption such a scheme exists) and then modifying it to a scheme
that retains MIS-R security but is not DS-secure.

Proposition 4.11 [MIS-R $\not\to$ DS] Suppose $0 < p < 1/2$. Let $\mathcal{E}' = \{\mathcal{E}'_k\}_{k \in \mathbb{N}}$ be an encryption
scheme with message length $m$ satisfying $m(k) \ge k$ for all $k \in \mathbb{N}$ and with sender ciphertext length $c'$.
Assume it is (strongly) MIS-R-secure relative to $\{\mathsf{BSC}_p^{c'(k)}\}_{k \in \mathbb{N}}$. Then there is an encryption scheme
$\mathcal{E} = \{\mathcal{E}_k\}_{k \in \mathbb{N}}$ with message length $m$ and sender ciphertext length $c$ that is (respectively, strongly) MIS-R-secure
relative to $\{\mathsf{BSC}_p^{c(k)}\}_{k \in \mathbb{N}}$ but not (respectively, strongly) DS-secure relative to $\{\mathsf{BSC}_p^{c(k)}\}_{k \in \mathbb{N}}$.
Furthermore, if $\mathcal{E}'$ is polynomial-time computable so is $\mathcal{E}$ and if $\mathcal{E}'$ is (strongly) decryptable relative to
$\{\mathsf{BSC}_p^{c'(k)}\}_{k \in \mathbb{N}}$ then $\mathcal{E}$ is (respectively, strongly) decryptable relative to $\{\mathsf{BSC}_p^{c(k)}\}_{k \in \mathbb{N}}$.

The final conditions regarding computability and decryptability are to ensure that we don't "cheat"
by making $\mathcal{E}$ different from $\mathcal{E}'$ in some categorical way.

Proof of Proposition 4.11: Let $n$ be an integer satisfying $e^{-n(0.5-p)^2/2} \le 1/4$. Let $c(k) = n + c'(k)$.
Let $\mathcal{E}_k(M)$ be defined by

    If ($M = 0^{m(k)}$ OR $M = 1^{m(k)}$) then $a \leftarrow M[1]$ else $a \stackrel{\$}{\leftarrow} \{0,1\}$
    $C' \stackrel{\$}{\leftarrow} \mathcal{E}'_k(M)$
    Return $a^n \,\|\, C'$

Recall $b^n$ denotes $n$ copies of the bit $b$. The choice of $n$ ensures that the probability that $\mathsf{BSC}_p(a^n)$
has at least $n/2$ positions equal to $a$ is at least 3/4 for any $a \in \{0,1\}$. (Thus, $a^n$ is a repetition-code
based encoding of $a$.) This allows a ds attack with messages $M_0 = 0^{m(k)}$ and $M_1 = 1^{m(k)}$ so that
$\mathbf{Adv}^{\mathrm{ds}}(\mathcal{E}_k; \mathsf{BSC}_p^{c(k)}) \ge 1/2$. This implies $\mathcal{E}$ is certainly not DS-secure, let alone strongly DS-secure,
regardless of the security of $\mathcal{E}'$.
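
The numbers behind this attack are easy to reproduce. The sketch below picks the smallest $n$ meeting the stated condition and then computes, exactly, the probability that at most half of the $n$ repeated bits are flipped, which is the event in which a majority vote over the first $n$ ciphertext positions recovers $a$. The function names and the sample value $p = 1/4$ are assumptions made here for illustration.

```python
import math
from fractions import Fraction

def repetition_length(p):
    """Smallest integer n with exp(-n * (0.5 - p)**2 / 2) <= 1/4."""
    return math.ceil(2 * math.log(4) / (0.5 - float(p)) ** 2)

def majority_survival(n, p):
    """Exact Pr[at most n/2 of n bits are flipped], each flipped with probability p."""
    q = 1 - p
    return sum(math.comb(n, j) * p ** j * q ** (n - j) for j in range(n // 2 + 1))

p = Fraction(1, 4)
n = repetition_length(p)
success = majority_survival(n, p)
# An adversary that guesses b by majority vote has advantage 2 * success - 1.
print(n, success >= Fraction(3, 4), 2 * success - 1 >= Fraction(1, 2))  # 45 True True
```

Passing $p$ as a `Fraction` keeps the binomial sum exact, which avoids floating-point overflow in the binomial coefficients for larger $n$.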

Now we want to show that $\mathcal{E}$ retains the assumed MIS-R security of $\mathcal{E}'$. The reason is that, if the
message is random, then the assumption $m(k) \ge k$ implies that it is very unlikely to be either $0^{m(k)}$
or $1^{m(k)}$. But if $M$ is not one of these messages then $\mathcal{E}_k(M)$ provides no more information about $M$
than $\mathcal{E}'_k(M)$. We can thus bound the increase in mis-r-advantage.

Finally, polynomial-time computability is preserved by construction and decryptability because
decryption in the new scheme can be done by ignoring the first $n$ bits of the receiver ciphertext and
decrypting the remainder as per the old scheme.

4.6 Settings in which MIS-R implies MIS 

Above we saw that in general MIS-R does not imply MIS. Here we show that for certain types of
encryption schemes and channels, MIS-R does imply MIS. This is exploited in [29, 28].

Throughout this section, it is convenient to think of any randomized encryption function $\mathcal{E}\colon
\{0,1\}^m \to \{0,1\}^c$ as a (deterministic) function $\{0,1\}^r \times \{0,1\}^m \to \{0,1\}^c$, where the first argument
takes the role of the random coins. We call $\mathcal{E}$ separable if

$$\mathcal{E}(R, M) = \mathcal{E}(R, 0^m) \oplus \mathcal{E}(0^r, M) \qquad (9)$$

for all $R \in \{0,1\}^r$ and $M \in \{0,1\}^m$. Also, $\mathcal{E}$ is message linear if the map $\mathcal{E}(0^r, \cdot)\colon \{0,1\}^m \to \{0,1\}^c$
is linear, i.e.,

$$\mathcal{E}(0^r, M + M') = \mathcal{E}(0^r, M) + \mathcal{E}(0^r, M') \qquad (10)$$

for all $M, M' \in \{0,1\}^m$.
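
A small exhaustive check may help fix the two definitions. The sketch below reads the addition in Eq. (10) as bitwise XOR, represents bit strings as Python integers, and uses a toy encryption function of the form $\mathcal{E}(R, M) = AR \oplus BM$ for fixed binary matrices $A, B$; the matrices, the sizes $r = 2$, $m = 3$, $c = 4$, and the deliberately broken variant `encrypt_bad` are all illustrative assumptions.

```python
def gf2_matvec(rows, x):
    """Multiply a GF(2) matrix (one integer bitmask per row) by the bit vector x."""
    out = 0
    for row in rows:
        # Each output bit is the parity of the bitwise AND of a row with x.
        out = (out << 1) | (bin(row & x).count("1") & 1)
    return out

R_BITS, M_BITS = 2, 3
A = [0b10, 0b11, 0b01, 0b11]      # 4 x 2 matrix acting on the coins R
B = [0b101, 0b011, 0b110, 0b111]  # 4 x 3 matrix acting on the message M

def encrypt(R, M):
    return gf2_matvec(A, R) ^ gf2_matvec(B, M)

def encrypt_bad(R, M):
    # Flipping an output bit when R and M are both odd couples coins and message.
    return encrypt(R, M) ^ (R & M & 1)

def is_separable(E):
    """Eq. (9): E(R, M) = E(R, 0) xor E(0, M) for all R, M."""
    return all(E(R, M) == E(R, 0) ^ E(0, M)
               for R in range(2 ** R_BITS) for M in range(2 ** M_BITS))

def is_message_linear(E):
    """Eq. (10), with addition read as bitwise XOR."""
    return all(E(0, M ^ M2) == E(0, M) ^ E(0, M2)
               for M in range(2 ** M_BITS) for M2 in range(2 ** M_BITS))

print(is_separable(encrypt), is_message_linear(encrypt), is_separable(encrypt_bad))
# True True False
```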

The following theorem states that for an encryption function which is message linear and separable,
MIS-R security implies MIS security when the adversarial channel is symmetric and operates
independently on ciphertext bits.

Theorem 4.12 [MIS-R $\to$ MIS] Let $\mathcal{E}\colon \{0,1\}^m \to \{0,1\}^c$ be a separable and message-linear
encryption function, and let $\mathsf{ChA}\colon \{0,1\} \to \{0,1\}^{\ell}$ be a symmetric channel. Then,

$$\mathbf{Adv}^{\mathrm{mis}}(\mathcal{E}; \mathsf{ChA}^c) \le \mathbf{Adv}^{\mathrm{mis\text{-}r}}(\mathcal{E}; \mathsf{ChA}^c) .$$

Before we turn to the proof of Theorem 4.12, we first note that the combination of the
encryption function $\mathcal{E}\colon \{0,1\}^m \to \{0,1\}^c$ and a channel $\mathsf{ChA}^c$ implicitly yields a new channel
$\mathsf{Ch}_{\mathcal{E}}\colon \{0,1\}^m \to \{0,1\}^{\ell \cdot c}$ which, on input $M \in \{0,1\}^m$, outputs $\mathsf{ChA}^c(\mathcal{E}(M))$. Its output corresponds
to the adversarial view. The proof of Theorem 4.12 relies on the channel $\mathsf{Ch}_{\mathcal{E}}$ being symmetric.
In particular, we need the following characterization of symmetric channels.

Lemma 4.13 Let $\mathsf{Ch}\colon \{0,1\}^m \to \{0,1\}^{\ell}$ be a channel with transition matrix $W$. Assume that there exists a family $\{\tau_M\}_{M \in \{0,1\}^m}$
of functions $\tau_M\colon \{0,1\}^{\ell} \to \{0,1\}^{\ell}$ such that $\tau_{0^m}$ is the identity on $\{0,1\}^{\ell}$, and moreover, for all
$M, M' \in \{0,1\}^m$ and $Y \in \{0,1\}^{\ell}$,

$$\tau_{M \oplus M'}(Y) = \tau_M(\tau_{M'}(Y)) , \qquad W[M \oplus M', Y] = W[M, \tau_{M'}(Y)] . \qquad (11)$$

Then, $\mathsf{Ch}$ is symmetric.

Proof of Lemma 4.13: For every fixed $M \in \{0,1\}^m$, we observe that $\tau_M$ is self-inverse, since
$\tau_M(\tau_M(Y)) = \tau_{M \oplus M}(Y) = \tau_{0^m}(Y) = Y$, and hence is a permutation on $\{0,1\}^{\ell}$. Define the relation $\sim$
such that $Y \sim Y'$ if and only if there exists $M \in \{0,1\}^m$ such that $\tau_M(Y) = Y'$. It is easy to verify
that $\sim$ is an equivalence relation.

Let us now partition the columns of the transition matrix $W$ so that two columns corresponding
to outputs $Y, Y' \in \{0,1\}^{\ell}$ are in the same sub-matrix if and only if $Y \sim Y'$. The induced sub-matrices
are strongly symmetric: On the one hand, for any two fixed $M, M' \in \{0,1\}^m$, $W[M, Y] =
W[M', \tau_{M \oplus M'}(Y)]$ for all $Y \in \{0,1\}^{\ell}$, and clearly $\tau_{M \oplus M'}(Y) \sim Y$. On the other hand, for any fixed
$Y \sim Y'$ there exists $M^* \in \{0,1\}^m$ such that $Y' = \tau_{M^*}(Y)$ and thus $W[M, Y] = W[M \oplus M^*, Y']$ for all
$M \in \{0,1\}^m$.

Proof of Theorem 4.12: We now provide a family of functions $\{\tau_M\}_{M \in \{0,1\}^m}$ as required in
Lemma 4.13 for the channel $\mathsf{Ch}_{\mathcal{E}}\colon \{0,1\}^m \to \{0,1\}^{\ell \cdot c}$. As $\mathsf{ChA}\colon \{0,1\} \to \{0,1\}^{\ell}$ is symmetric, its
transition matrix can be decomposed into sub-matrices with the property that, for each sub-matrix,
there exist values $p, q \in [0,1]$ such that all columns are of the form $[p, q]^T$ or $[q, p]^T$, and there is an equal
number of columns of each of the two types. Hence, there exists a permutation $\pi_1\colon \{0,1\}^{\ell} \to \{0,1\}^{\ell}$
such that $\Pr[\mathsf{ChA}(1) = y] = \Pr[\mathsf{ChA}(0) = \pi_1(y)]$, and, moreover, $\pi_1^{-1} = \pi_1$. Therefore, with $\pi_0$
being the identity, we have $\Pr[\mathsf{ChA}(b) = y] = \Pr[\mathsf{ChA}(0) = \pi_b(y)]$. Also, it is easy to verify that
$\pi_{b \oplus b'} = \pi_b \circ \pi_{b'}$ for all $b, b' \in \{0,1\}$. In the same way, when considering $c$-bit inputs, we define a
permutation $\pi_X\colon \{0,1\}^{\ell \cdot c} \to \{0,1\}^{\ell \cdot c}$ such that

$$\pi_X(Y) = \big( \pi_{X[1]}(Y[1]), \ldots, \pi_{X[c]}(Y[c]) \big)$$

for all $X \in \{0,1\}^c$. Then,

$$\Pr[\mathsf{ChA}^c(X) = Y] = \prod_{i=1}^{c} \Pr[\mathsf{ChA}(X[i]) = Y[i]] = \prod_{i=1}^{c} \Pr[\mathsf{ChA}(0) = \pi_{X[i]}(Y[i])] = \Pr[\mathsf{ChA}^c(0^c) = \pi_X(Y)]$$

for all $X \in \{0,1\}^c$ and $Y \in \{0,1\}^{\ell \cdot c}$. Consequently, $\pi_{X \oplus X'} = \pi_X \circ \pi_{X'}$ for all $X, X' \in \{0,1\}^c$. We now
define, for all $M \in \{0,1\}^m$ and $Y \in \{0,1\}^{\ell \cdot c}$,

$$\tau_M(Y) = \pi_{\mathcal{E}(0^r, M)}(Y) .$$

In addition to $\tau_{0^m}$ being the identity (since $\mathcal{E}(0^r, 0^m) = 0^c$), using message linearity we verify that, for
all $M, M' \in \{0,1\}^m$,

$$\tau_{M \oplus M'} = \pi_{\mathcal{E}(0^r, M \oplus M')} = \pi_{\mathcal{E}(0^r, M) \oplus \mathcal{E}(0^r, M')} = \pi_{\mathcal{E}(0^r, M)} \circ \pi_{\mathcal{E}(0^r, M')} = \tau_M \circ \tau_{M'} .$$

To conclude, with $\mathsf{R}$ being the $r$-bit randomness used by the encryption $\mathcal{E}$,

$$\begin{aligned}
\Pr[\mathsf{Ch}_{\mathcal{E}}(M \oplus M') = Y] &= \sum_{R \in \{0,1\}^r} \Pr[\mathsf{R} = R] \cdot \Pr[\mathsf{ChA}^c(\mathcal{E}(R, M \oplus M')) = Y] \\
&= \sum_{R \in \{0,1\}^r} \Pr[\mathsf{R} = R] \cdot \Pr[\mathsf{ChA}^c(0^c) = \pi_{\mathcal{E}(R, M \oplus M')}(Y)] \\
&= \sum_{R \in \{0,1\}^r} \Pr[\mathsf{R} = R] \cdot \Pr[\mathsf{ChA}^c(0^c) = \pi_{\mathcal{E}(R, M) \oplus \mathcal{E}(0^r, M')}(Y)] \\
&= \sum_{R \in \{0,1\}^r} \Pr[\mathsf{R} = R] \cdot \Pr[\mathsf{ChA}^c(0^c) = \pi_{\mathcal{E}(R, M)}(\pi_{\mathcal{E}(0^r, M')}(Y))] \\
&= \sum_{R \in \{0,1\}^r} \Pr[\mathsf{R} = R] \cdot \Pr[\mathsf{ChA}^c(\mathcal{E}(R, M)) = \pi_{\mathcal{E}(0^r, M')}(Y)] \\
&= \Pr[\mathsf{Ch}_{\mathcal{E}}(M) = \tau_{M'}(Y)] .
\end{aligned}$$

Therefore, $\mathsf{Ch}_{\mathcal{E}}$ is symmetric by Lemma 4.13.

To conclude the proof, we observe that for every symmetric channel $\mathsf{Ch}$ with $m$-bit inputs and every
$m$-bit random variable $\mathsf{M}$,

$$\mathbf{I}(\mathsf{M}; \mathsf{Ch}(\mathsf{M})) \le \mathbf{I}(\mathsf{U}; \mathsf{Ch}(\mathsf{U}))$$

where $\mathsf{U}$ is a uniformly distributed $m$-bit string (see e.g. [11, Theorem 7.2.1] for a proof). This implies
the theorem statement in the case where $\mathsf{Ch} = \mathsf{Ch}_{\mathcal{E}}$.

Extensions. We do not know how to extend the above result to arbitrary symmetric channels with
multi-bit inputs, yet there are several natural channels $\mathsf{ChA}$ for which MIS-R-security implies
MIS-security.

An (important) example is the channel $\mathsf{ChA}\colon \{0,1\}^c \to \{0,1\}^c$ which, on input $C \in \{0,1\}^c$,
samples a fresh noise string $E \in \{0,1\}^c$ according to some given noise distribution, and finally outputs
$C \oplus E$. Provided $\mathcal{E}\colon \{0,1\}^r \times \{0,1\}^m \to \{0,1\}^c$ is separable and message linear, the combined channel
$\mathsf{Ch}_{\mathcal{E}}$ is proven to be symmetric by defining $\tau_M$ such that $\tau_M(Y) = \mathcal{E}(0^r, M) \oplus Y$ for all $M \in \{0,1\}^m$
and $Y \in \{0,1\}^c$.



5 Achieving DS security 

The broad question with regard to achieving security is the following. Let us fix a metric xs, a family 
ChR of receiver channels and a family ChA of adversary channels. Is there an encryption scheme that 
is decryptable relative to ChR while achieving XS-security relative to ChA? 

This question has been examined in the I&C community when XS=MIS-R. The question of interest 
has been to achieve (and determine) the optimal rate. Results tend to be non-constructive, proving 
the existence of schemes but not providing explicit schemes, let alone ones that have polynomial time 
encryption and decryption. 

By introducing more demanding security metrics, we have upped the ante. We ask about the 
achievability of DS (equivalently, SS, MIS). With a practical perspective, we seek not mere existence 
results but explicit schemes with polynomial-time encryption and decryption. In this section we 
present such schemes. 

The rate of our schemes, although reasonable, is short of optimal. Subsequent work has tackled 
the fundamental question of determining and achieving the optimal rate, showing, for a wide class 
of receiver and adversary channels, that the optimal rate for DS security equals the MIS-R secrecy 
capacity (meaning, the optimal rate for MIS-R security), and presenting schemes that achieve this 
while having polynomial-time encryption and decryption [3]. 

The methods we use in this section are based on extractors. (More precisely, what in the
cryptographic literature are called strong randomness extractors.) For concreteness we use extractors
defined via universal hash functions and their analysis via the Leftover Hash Lemma [20] and its
generalizations [16, 15].

Direct use of extractors would, however, provide security without error correction, yielding
DS-secure schemes when the receiver channel is the clear one. Adding error-correction is not as simple
as putting an ECC on top of an encryption that is secure when the receiver channel is the clear one
because the ECC helps the adversary. Our approach here is to reduce DS-security to a weaker security
requirement that we call rs-r security on an ECC. This is an adaptation of the ideas of secure sketches
and fuzzy extractors from [16, 15].

5.1 Hash functions as extractors 

A hash function is simply a two-argument function $\mathcal{H}\colon \{0,1\}^h \times \{0,1\}^u \to \{0,1\}^m$. To every "name"
or description $H \in \{0,1\}^h$, we associate the function $\mathcal{H}_H = \mathcal{H}(H, \cdot)\colon \{0,1\}^u \to \{0,1\}^m$, meaning
$\mathcal{H}_H(U) = \mathcal{H}(H, U)$ for all $U \in \{0,1\}^u$. We say that $\mathcal{H}$ is universal if for all distinct $U_1, U_2 \in \{0,1\}^u$
we have

$$\Pr[\mathcal{H}(\mathsf{H}, U_1) = \mathcal{H}(\mathsf{H}, U_2)] \le 2^{-m} ,$$

where the probability is (only) over $\mathsf{H} \stackrel{\$}{\leftarrow} \{0,1\}^h$. We now give some concrete constructions.

The matrix-based construction $\mathcal{H}\colon \{0,1\}^h \times \{0,1\}^u \to \{0,1\}^m$ has $h = um$ and views a description
$H \in \{0,1\}^h$ as specifying an $m$ by $u$ matrix over $\mathrm{GF}(2)$. It then lets $\mathcal{H}(H, U) = HU$, where $U$ is viewed
as a $u$ by 1 matrix over $\mathrm{GF}(2)$ and $HU$ is matrix-vector multiplication, returning an $m$ by 1 matrix over
$\mathrm{GF}(2)$ that is regarded as an $m$-bit string.

When $m \le u$, a construction reducing the description length $h$ from $um$ to just $u$ can be obtained
as follows. Identify $\{0,1\}^{\ell}$ with $\mathrm{GF}(2^{\ell})$ for any $\ell$. Fix a regular projection $\pi\colon \mathrm{GF}(2^u) \to \mathrm{GF}(2^m)$, meaning
that every $y \in \mathrm{GF}(2^m)$ has exactly $2^{u-m}$ pre-images under $\pi$. The description $H$ of a function is a point
$H \in \mathrm{GF}(2^u)$, and we let $\mathcal{H}(H, U) = \pi(HU)$. Here the multiplication is in $\mathrm{GF}(2^u)$. These are all standard
constructions whose universality is easily checked [43, Chapter 8].
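
For the matrix-based construction, universality can be confirmed by exhaustive enumeration at toy sizes. In the sketch below a description $H$ is represented as a tuple of $m$ row bitmasks of $u$ bits each (so $h = um$ bits in total) and inputs are integers below $2^u$; this representation, the function names, and the sizes $u = 3$, $m = 2$ are assumptions made for the example.

```python
from fractions import Fraction
from itertools import product

def matrix_hash(rows, x):
    """H(H, U) = H * U over GF(2); rows holds the m rows of H as u-bit masks."""
    out = 0
    for row in rows:
        out = (out << 1) | (bin(row & x).count("1") & 1)
    return out

def collision_probability(u, m, x1, x2):
    """Exact Pr over a uniform m-by-u matrix that x1 and x2 collide."""
    hits = sum(matrix_hash(rows, x1) == matrix_hash(rows, x2)
               for rows in product(range(2 ** u), repeat=m))
    return Fraction(hits, 2 ** (u * m))

def is_universal(u, m):
    """Check the universality condition for every pair of distinct inputs."""
    return all(collision_probability(u, m, x1, x2) <= Fraction(1, 2 ** m)
               for x1 in range(2 ** u) for x2 in range(x1))

print(is_universal(3, 2), collision_probability(3, 2, 0b001, 0b100))  # True 1/4
```

The collision probability comes out as exactly $2^{-m}$ for every distinct pair, because the two inputs collide precisely when the matrix maps their (non-zero) XOR to zero.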

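The following generalization of the Leftover Hash Lemma of [20] was stated in [16, 15].

    Transform $\mathcal{E}(M)$
      $H \stackrel{\$}{\leftarrow} \{0,1\}^h$ ; $U \stackrel{\$}{\leftarrow} \{0,1\}^u$
      $P \leftarrow \mathcal{H}(H, U)$ ; $W \leftarrow P \oplus M$
      $X \leftarrow \mathsf{En1}(U) \,\|\, \mathsf{En2}(H \,\|\, W)$
      Return $X$

    Transform $\mathcal{D}(Y)$
      $Y_1 \,\|\, Y_2 \leftarrow Y$
      $U \leftarrow \mathsf{En1}^{-1}(Y_1)$ ; $H \,\|\, W \leftarrow \mathsf{En2}^{-1}(Y_2)$
      $P \leftarrow \mathcal{H}(H, U)$ ; $M \leftarrow P \oplus W$
      Return $M$

Figure 3: The first transform is the encryption function $\mathcal{E} = \mathsf{XtX}[\mathcal{H}, \mathsf{En1}, \mathsf{En2}]\colon \{0,1\}^m \to \{0,1\}^{n_1+n_2}$
associated to hash function $\mathcal{H}\colon \{0,1\}^h \times \{0,1\}^u \to \{0,1\}^m$ and ECCs $\mathsf{En1}\colon \{0,1\}^u \to \{0,1\}^{n_1}$ and
$\mathsf{En2}\colon \{0,1\}^{h+m} \to \{0,1\}^{n_2}$ by the XtX construction. The second is the decryption function for a
channel $\mathsf{ChR1} \,\|\, \mathsf{ChR2}$ for which $\mathsf{En1}, \mathsf{En2}$ are, respectively, ECCs.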



Lemma 5.1 [Generalized Leftover Hash Lemma] Let H: {0, l} h x {0, 1}" — > {0, l} m be a uni- 
versal hash function. Let U be a random variable over {0,1}". Let random variable H be uniformly 
distributed over {0, l} h . Let random variable V be uniformly distributed over {0, l} m . Let Z be a 
random variable. Assume (U,Z),H,V are independent. Then 



Above, Z may depend on U but the three random variables (U, Z), H, V must be independent. 

5.2 The XtX construction 

Let ft: {0,l}^x {0,1}" -»• {0, l} m be a hash function. Let Enl: {0, 1}" -> {0, l} ni and En2: {0, l} h + m -». 
{0, l}™ 2 be injective functions. These will later be instantiated by (the encoding functions of) ECCs. 
The XtX (extract then xor) construction associates to ft, Enl, En2 the encryption function £ = 
XtX[ft, Enl, En2]: {0, l} m ->■ {0, l}"i+ n 2 defined via Figure O The encryption function picks at ran- 
dom a /i-bit string H to specify hash function 7i(H, •) as well as a random r-bit string U, and hashes 
[/ under H(H, •) to obtain a m-bit pad P. The latter is xored to the message M G {0, l} m to get W. 
The sender ciphertext X includes W, but also includes H, X to enable decryption. 

5.3 Overview 

We will analyze XtX in a general setting. The only assumption made about the channel (whether 
receiver or adversary) Ch: {0, l} ni+n2 — > {0, l} d is that it is (n\, rt2)-splittable, meaning the applica- 
tions of the channel on Enl({7) and on En2(if||Vl / ) are independent. The canonical setting is that both 
the receiver and adversary channel are induced by binary channels, in which case they are certainly 
(ni, ri2)-sphttable. The binary channels here may be symmetric but need not be. Within this general 
setting, we look at decrypt ability and DS-security and establish the following: 

• Say the receiver channel splits as ChR = ChRl||ChR2. If Enl is a good ECC for ChRl and En2 
is a good ECC for ChR2 then XtX[H, Enl, En2] is decryptable. (By "good" here we mean that 
decoding is possible with low error.) 

• Say the adversary channel splits as ChA = ChAl||ChA2. DS-security of XtX [7^, Enl, En2] relative 
to ChA makes no assumptions about ChR2 or En2, meaning holds for all choices of these. The 
requirements will be that T~L is universal and that Enl, viewed as an encryption function, provides 
a certain weak security property, called rs-r (recovery security for random messages), relative to 
ChAl. This asks that if U is uniformly distributed then it is hard to recover it from ChAl(Enl(U)). 
This notion and approach are inspired by the secure sketches of |16| H5] . 

Now, given a particular choice of receiver and adversary channels, we would pick Enl, En2 to give 
us the error-correction and security properties needed. To exemplify we will consider the case where 
ChR, ChA are induced by binary channels and in particular the binary symmetric channel. For these 



SD((H,Z,7£(H,U));(H,Z,V)) < ■ GP(U|Z) . I 
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cases we will provide bounds on the rs-r advantage of arbitrary ECCs. With this in hand, concrete 
solutions emerge for these channels. 

Both encryption and decryption will be polynomial-time assuming the codes have polynomial-time
encoding and decoding. The rate of $\mathsf{XtX}[\mathcal{H}, \mathsf{En1}, \mathsf{En2}]$ depends on the choice of codes but even with
optimal choices is not itself optimal. It turns out that $H$, once chosen and transmitted, can be re-used
across multiple encryptions, playing the role of a seed in what in [3] is called a seeded-encryption
scheme, so, via amortization, it can be ignored from the point of view of rate. This increases the rate
but still leaves it short of optimal. We are not aware of any simple way to close this gap. A scheme with
optimal rate is given in [3] using alternative techniques.

5.4 Decryptability of XtX 

Consider any receiver channel $\mathsf{ChR}$ that has the form $\mathsf{ChR} = \mathsf{ChR1}\|\mathsf{ChR2}$ where $\mathsf{ChR1}: \{0,1\}^{n_1} \to
\{0,1\}^{d_1}$ and $\mathsf{ChR2}: \{0,1\}^{n_2} \to \{0,1\}^{d_2}$. If $\mathsf{En1}, \mathsf{En2}$ are, respectively, good ECCs over $\mathsf{ChR1}, \mathsf{ChR2}$,
then $\mathsf{XtX}[\mathcal{H}, \mathsf{En1}, \mathsf{En2}]$ will be decryptable over $\mathsf{ChR}$. Decryption is performed as shown in the
accompanying figure. The first step parses receiver ciphertext $Y$ into its first $d_1$ bits $Y_1$ and last $d_2$ bits $Y_2$.
Decryption (i.e., decoding) algorithms for the ECCs are then applied, and the outputs again appropriately parsed.

Theorem 5.2 [Decryptability of XtX] Let $\mathcal{H}: \{0,1\}^{h} \times \{0,1\}^{u} \to \{0,1\}^{m}$ be a hash function.
Let $\mathsf{En1}^{-1}: \{0,1\}^{d_1} \to \{0,1\}^{u}$ be a decryption function for $\mathsf{En1}: \{0,1\}^{u} \to \{0,1\}^{n_1}$ over channel
$\mathsf{ChR1}: \{0,1\}^{n_1} \to \{0,1\}^{d_1}$ with decryption error $\delta_1$. Let $\mathsf{En2}^{-1}: \{0,1\}^{d_2} \to \{0,1\}^{h+m}$ be a decryption
function for $\mathsf{En2}: \{0,1\}^{h+m} \to \{0,1\}^{n_2}$ over channel $\mathsf{ChR2}: \{0,1\}^{n_2} \to \{0,1\}^{d_2}$ with decryption error
$\delta_2$. Let $\mathcal{E} = \mathsf{XtX}[\mathcal{H}, \mathsf{En1}, \mathsf{En2}]$. Then $\mathcal{D}: \{0,1\}^{d_1+d_2} \to \{0,1\}^{m}$, as depicted in the figure, is a decryption
function for $\mathcal{E}$ with decryption error at most $\delta_1 + \delta_2$.
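
To make the parse-decode-unmask flow of decryption concrete, here is a minimal Python sketch of XtX, following the ciphertext shape $\mathsf{En1}(U)\|\mathsf{En2}(H\|(\mathcal{H}(H,U)\oplus M))$ that appears in the proof of Theorem 5.3. Everything named here is an assumption for illustration: `toy_hash` is a random-binary-matrix hash (a standard universal family, not necessarily one of the paper's constructions), and `rep3_encode`/`rep3_decode` are a toy 3-fold repetition code standing in for real ECCs.

```python
import secrets

def toy_hash(key, x, m):
    """Hypothetical universal hash: key is an m-by-u binary matrix (row-major);
    the output is the matrix-vector product key * x over GF(2)."""
    u = len(x)
    return [sum(k & b for k, b in zip(key[i * u:(i + 1) * u], x)) % 2
            for i in range(m)]

def rep3_encode(bits):
    """Toy ECC (assumption): repeat every bit three times."""
    return [b for b in bits for _ in range(3)]

def rep3_decode(bits):
    """Majority vote over each block of three received bits."""
    return [int(sum(bits[i:i + 3]) >= 2) for i in range(0, len(bits), 3)]

def xtx_encrypt(key, msg, u, enc1=rep3_encode, enc2=rep3_encode):
    """Return (sender ciphertext, length of its first component)."""
    U = [secrets.randbits(1) for _ in range(u)]             # fresh random U
    W = [h ^ b for h, b in zip(toy_hash(key, U, len(msg)), msg)]
    X1 = enc1(U)                                            # En1(U)
    return X1 + enc2(key + W), len(X1)                      # En1(U) || En2(H || W)

def xtx_decrypt(Y, d1, m, dec1=rep3_decode, dec2=rep3_decode):
    """Parse Y into Y1 (first d1 bits) and Y2, decode both, then unmask."""
    U = dec1(Y[:d1])
    HW = dec2(Y[d1:])
    key, W = HW[:-m], HW[-m:]
    return [h ^ w for h, w in zip(toy_hash(key, U, m), W)]
```

With `m = 4`, `u = 8` and a random key of `m * u` bits, flipping one bit in each half of the ciphertext still decrypts correctly, because each repetition block absorbs a single error.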

5.5 DS-security of XtX 

Regard $\mathsf{En1}: \{0,1\}^{u} \to \{0,1\}^{n_1}$ as an encryption function. We define recovery security for random
messages (rs-r) relative to channel $\mathsf{ChA1}: \{0,1\}^{n_1} \to \{0,1\}^{d_1}$ via the advantage

$$ \mathbf{Adv}^{\mathrm{rs\text{-}r}}(\mathsf{En1}; \mathsf{ChA1}) \;=\; \mathbf{GP}\big(U \,|\, \mathsf{ChA1}(\mathsf{En1}(U))\big) \;=\; \max_{A} \Pr\big[A(\mathsf{ChA1}(\mathsf{En1}(U))) = U\big] $$

where random variable $U$ is uniformly distributed over $\{0,1\}^{u}$ and the maximum is over all adversaries
$A$. As this shows, the definition can be expressed either in terms of information-theoretic quantities (the
guessing probability) or using adversaries, and we will have occasion to exploit both representations.

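Theorem 5.3 [DS-security of XtX] Let $\mathcal{H}: \{0,1\}^{h} \times \{0,1\}^{u} \to \{0,1\}^{m}$ be a universal hash function.
Let $\mathsf{En1}: \{0,1\}^{u} \to \{0,1\}^{n_1}$ and $\mathsf{En2}: \{0,1\}^{h+m} \to \{0,1\}^{n_2}$ be injective functions. Let $\mathcal{E} =
\mathsf{XtX}[\mathcal{H}, \mathsf{En1}, \mathsf{En2}]$. Let $\mathsf{ChA1}: \{0,1\}^{n_1} \to \{0,1\}^{d_1}$ and $\mathsf{ChA2}: \{0,1\}^{n_2} \to \{0,1\}^{d_2}$ be channels. Let
$\mathsf{ChA} = \mathsf{ChA1}\|\mathsf{ChA2}$. Then

$$ \mathbf{Adv}^{\mathrm{ds}}(\mathcal{E}; \mathsf{ChA}) \;\le\; \sqrt{2^{m} \cdot \mathbf{Adv}^{\mathrm{rs\text{-}r}}(\mathsf{En1}; \mathsf{ChA1})} \;. $$

The bound does not depend on $\mathsf{ChA2}, \mathsf{En2}$, reflecting that security is independent of the choice of
these. The proof will use the following standard lemma.

Lemma 5.4 Let $X_1, X_2$ be random variables and let $f$ be a (possibly randomized) transform applied to both.
Then $\mathbf{SD}(f(X_1); f(X_2)) \le \mathbf{SD}(X_1; X_2)$.

Given this we proceed to prove Theorem 5.3.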

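Proof of Theorem 5.3: Let $M_0, M_1 \in \{0,1\}^{m}$. Let $U, H, V$ be uniformly and independently
distributed random variables, the first over $\{0,1\}^{u}$, the second over $\{0,1\}^{h}$ and the third over $\{0,1\}^{m}$.
Let $Z = \mathsf{ChA1}(\mathsf{En1}(U))$. For $c \in \{0,1\}$ let $W_c = \mathsf{ChA2}(\mathsf{En2}(H \,\|\, \mathcal{H}(H,U) \oplus M_c))$. Let

$$ \delta(M_0, M_1) = \mathbf{SD}\big((Z, W_0);\, (Z, W_1)\big) $$
$$ \epsilon(M_0, M_1) = \mathbf{SD}\big((H, Z, \mathcal{H}(H,U) \oplus M_0);\, (H, Z, \mathcal{H}(H,U) \oplus M_1)\big) . $$

Lemma 5.4 implies that regardless of the choices of $\mathsf{En2}$ and $\mathsf{ChA2}$ we have

$$ \delta(M_0, M_1) \le \epsilon(M_0, M_1) . $$

We will now show that

$$ \epsilon(M_0, M_1) \;\le\; \sqrt{2^{m} \cdot \mathbf{Adv}^{\mathrm{rs\text{-}r}}(\mathsf{En1}; \mathsf{ChA1})} \;. \qquad (12) $$

Since $M_0, M_1$ were arbitrary, this proves the theorem.
Towards proving (12), let

$$ \epsilon_1(M_0, M_1) = \mathbf{SD}\big((H, Z, \mathcal{H}(H,U) \oplus M_0);\, (H, Z, V \oplus M_0)\big) $$
$$ \epsilon_2(M_0, M_1) = \mathbf{SD}\big((H, Z, V \oplus M_0);\, (H, Z, V \oplus M_1)\big) $$
$$ \epsilon_3(M_0, M_1) = \mathbf{SD}\big((H, Z, V \oplus M_1);\, (H, Z, \mathcal{H}(H,U) \oplus M_1)\big) . $$

By the triangle inequality we have

$$ \epsilon(M_0, M_1) \le \epsilon_1(M_0, M_1) + \epsilon_2(M_0, M_1) + \epsilon_3(M_0, M_1) . $$

However, $\epsilon_2(M_0, M_1) = 0$ because $V$ is uniformly distributed over $\{0,1\}^{m}$ and $M_0, M_1 \in \{0,1\}^{m}$. Also,

$$ \epsilon_1(M_0, M_1) = \epsilon_3(M_0, M_1) = \mathbf{SD}\big((H, Z, \mathcal{H}(H,U));\, (H, Z, V)\big) . $$

We conclude by applying Lemma 5.1, which bounds this statistical distance by $\frac{1}{2}\sqrt{2^{m} \cdot \mathbf{GP}(U|Z)}$, and by
noting that $\mathbf{GP}(U|Z) = \mathbf{Adv}^{\mathrm{rs\text{-}r}}(\mathsf{En1}; \mathsf{ChA1})$.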



5.6 XtX over a BSC 

We can apply the above general framework and results to provide concrete DS-secure encryption 
schemes for particular channels. We illustrate here for the case of the binary symmetric channel. We 
begin for simplicity with the case that the receiver channel is error-free and then move on to the case 
where it is not. 

Assume the receiver channel is error-free. In this case we can set $\mathsf{En1} = \mathsf{Id}_u$ and $\mathsf{En2} = \mathsf{Id}_{h+m}$
to identity functions. Rs-r advantages are now easy to bound. In the case of the binary symmetric
channel we have the following:

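Proposition 5.5 Suppose $0 < p \le 1/2$ and let $\mathsf{ChA} = \mathsf{BSC}_p^{u}$. Then

$$ \mathbf{Adv}^{\mathrm{rs\text{-}r}}(\mathsf{Id}_u; \mathsf{ChA}) = (1-p)^{u} . \qquad (13) $$

Proof: Letting random variable $U$ be uniformly distributed over $\{0,1\}^{u}$ we have

$$ \mathbf{GP}(U \,|\, \mathsf{BSC}_p^{u}(U)) = \sum_{z} \Pr[\mathsf{BSC}_p^{u}(U) = z] \cdot \max_{x} \Pr[U = x \,|\, \mathsf{BSC}_p^{u}(U) = z] $$
$$ = \sum_{z} \Pr[\mathsf{BSC}_p^{u}(U) = z] \cdot \Pr[U = z \,|\, \mathsf{BSC}_p^{u}(U) = z] $$
$$ = \sum_{z} \Pr[\mathsf{BSC}_p^{u}(U) = z] \cdot (1-p)^{u} $$
$$ = \sum_{z} 2^{-u} \cdot (1-p)^{u} $$
$$ = (1-p)^{u} , $$

establishing (13).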
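
As a sanity check on Proposition 5.5, the guessing probability can be computed by exhaustive enumeration for small $u$ and compared with $(1-p)^u$. The sketch below is illustrative only; the helper names `bsc_prob` and `gp_identity_over_bsc` are hypothetical, and the enumeration is exponential in $u$.

```python
from itertools import product

def bsc_prob(x, z, p):
    """Probability that a BSC with crossover p maps bit-tuple x to bit-tuple z."""
    flips = sum(a != b for a, b in zip(x, z))
    return (p ** flips) * ((1 - p) ** (len(x) - flips))

def gp_identity_over_bsc(u, p):
    """GP(U | BSC_p^u(U)) for uniform U over {0,1}^u, by brute force."""
    points = list(product((0, 1), repeat=u))
    total = 0.0
    for z in points:
        # The best guess for output z maximizes the joint probability Pr[U = x, Z = z].
        total += max(bsc_prob(x, z, p) / len(points) for x in points)
    return total
```

For instance, `gp_identity_over_bsc(4, 0.1)` agrees with `0.9 ** 4` up to floating-point error.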

By combining Theorem 5.3 and Proposition 5.5 we get the following concrete bound for DS-security
of XtX encryption in the case the receiver channel is error-free and the adversary channel is a binary
symmetric one:

Theorem 5.6 Let $\mathcal{H}: \{0,1\}^{h} \times \{0,1\}^{u} \to \{0,1\}^{m}$ be a universal hash function, and let $\mathcal{E} = \mathsf{XtX}[\mathcal{H}, \mathsf{Id}_u,
\mathsf{Id}_{h+m}]$. Suppose $0 < p \le 1/2$ and let $\mathsf{ChA} = \mathsf{BSC}_p^{u+h+m}$. Then

$$ \mathbf{Adv}^{\mathrm{ds}}(\mathcal{E}; \mathsf{ChA}) \;\le\; \sqrt{2^{m} \cdot (1-p)^{u}} \;. $$

Now let us consider the case where the receiver channel is not error-free. $\mathsf{En1}, \mathsf{En2}$ would now be
chosen to correct errors over the receiver channel. Taking them as given, we need to bound the rs-r
advantage of the former. For the case where the adversary channel is a binary symmetric one we have
the following:

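Proposition 5.7 Let $\mathsf{En1}: \{0,1\}^{u} \to \{0,1\}^{u+r}$ be an injective function. Suppose $0 < p \le 1/2$ and let
$\mathsf{ChA} = \mathsf{BSC}_p^{u+r}$. Then

$$ \mathbf{Adv}^{\mathrm{rs\text{-}r}}(\mathsf{En1}; \mathsf{ChA}) \;\le\; 2^{r} (1-p)^{u+r} . \qquad (14) $$

Proof: Let $Z = \mathsf{ChA}(\mathsf{En1}(U))$ where random variable $U$ is uniformly distributed over $\{0,1\}^{u}$. Then we
have

$$ \mathbf{GP}(U|Z) = \sum_{z} \Pr[Z = z] \cdot \max_{x} \Pr[U = x \,|\, Z = z] $$
$$ = \sum_{z} \Pr[Z = z] \cdot \max_{x} \frac{\Pr[Z = z \,|\, U = x] \cdot \Pr[U = x]}{\Pr[Z = z]} $$
$$ = \sum_{z} \max_{x} \Pr[Z = z \,|\, U = x] \cdot 2^{-u} $$
$$ \le \sum_{z} 2^{-u} (1-p)^{u+r} $$
$$ = 2^{r} (1-p)^{u+r} , $$

which proves the Proposition.

From Theorem 5.3 and the above we directly obtain the following:

Theorem 5.8 Let $\mathcal{H}: \{0,1\}^{h} \times \{0,1\}^{u} \to \{0,1\}^{m}$ be a universal hash function. Let $\mathsf{En1}: \{0,1\}^{u} \to
\{0,1\}^{u+r}$ and $\mathsf{En2}: \{0,1\}^{h+m} \to \{0,1\}^{n_2}$ be injective functions. Let $\mathcal{E} = \mathsf{XtX}[\mathcal{H}, \mathsf{En1}, \mathsf{En2}]$. Suppose
$0 < p \le 1/2$ and let $\mathsf{ChA} = \mathsf{BSC}_p^{u+r+n_2}$. Then

$$ \mathbf{Adv}^{\mathrm{ds}}(\mathcal{E}; \mathsf{ChA}) \;\le\; \sqrt{2^{m+r} \cdot (1-p)^{u+r}} \;. $$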



5.7 Numerical estimates 

In usage, we would imagine the user or system designer wanting to create a scheme that encrypts
messages of a certain desired length $m$ and achieves a certain desired number of bits $s$ of DS-security
under the assumption that the adversary channel is a BSC with a certain, given crossover probability
$p$. Our job, given $m, s, p$, is to provide the scheme. The above allows us to do so. To illustrate, let
us for simplicity consider the case where the receiver channel is error-free, so that $\mathcal{E} = \mathsf{XtX}[\mathcal{H}, \mathsf{Id}_u,
\mathsf{Id}_{h+m}]$. Our task amounts to picking $\mathcal{H}$ so that $\mathbf{Adv}^{\mathrm{ds}}(\mathcal{E}; \mathsf{BSC}_p^{u+h+m}) \le 2^{-s}$. Theorem 5.6 says that it
suffices to let

$$ u = \frac{2s + m}{\alpha} \quad \text{where} \quad \alpha = \lg \frac{1}{1-p} \;. $$

We can then use one of the constructions of $\mathcal{H}: \{0,1\}^{h} \times \{0,1\}^{u} \to \{0,1\}^{m}$ given above. The amortized
rate of the scheme is

$$ \mathrm{Rate}(\mathcal{E}) = \frac{m}{u + m} = \frac{\alpha m}{2s + m + \alpha m} $$

which is

$$ \overline{\mathrm{Rate}}(\mathcal{E}) = \frac{\alpha}{1 + \alpha} $$

in the limit as $m \to \infty$. By the amortized rate we mean that we ignore $H$ with the understanding
that it can be chosen and transmitted just once, all subsequent encryptions using the same $H$, so the
amortized rate does not depend on $h$.

Figure 4: For various values of the crossover probability $p$ of the adversary BSC, we show the limiting
amortized rate and dual-channel limiting amortized rate achievable by $\mathsf{XtX}[\mathcal{H}, \mathsf{Id}_u, \mathsf{Id}_{h+m}]$.



We note another improvement. As we saw above, security is not affected by $\mathsf{ChA2}$, the channel over
which $W$ is sent. (We have already sent $H$ upfront, so only $W$ goes over this channel.) The sender
could thus transmit $W$ separately over a clear channel. The adversary would be in full possession
of $W$ but security is not reduced. The advantage is that transmission in a way that renders the
adversary's reception noisy is much more expensive than transmission in the clear, so sending $W$
becomes essentially free. Let us refer to this as the dual-channel setting and let $\mathrm{Rate}_2$ denote the
corresponding rate, which is

$$ \mathrm{Rate}_2(\mathcal{E}) = \frac{m}{u} = \frac{\alpha m}{2s + m} \;. $$

In the limit as $m \to \infty$ this is

$$ \overline{\mathrm{Rate}}_2(\mathcal{E}) = \alpha \;. $$

In Figure 4 we illustrate, showing, for a few choices of $p$, the corresponding rates. The rate obviously
decreases as $p$ decreases, but the salient fact is that we can get security for any positive value of $p$, even if the
rate is quite low.
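
The parameter selection just described is simple arithmetic, and the following Python sketch (an illustration, not code from the paper) carries it out for the error-free receiver case. The function name and the returned dictionary keys are hypothetical; $u$ is rounded up to an integer, so the finite-$m$ rates can be marginally below the closed forms.

```python
import math

def xtx_error_free_parameters(m, s, p):
    """Choose u so that sqrt(2^m * (1-p)^u) <= 2^-s, and report the resulting rates."""
    alpha = math.log2(1.0 / (1.0 - p))        # alpha = lg(1 / (1 - p))
    u = math.ceil((2 * s + m) / alpha)        # integer u with u >= (2s + m) / alpha
    return {
        "alpha": alpha,
        "u": u,
        "rate": m / (u + m),                  # amortized rate (H ignored)
        "rate2": m / u,                       # dual-channel amortized rate
        "limit_rate": alpha / (1.0 + alpha),  # limit as m grows
        "limit_rate2": alpha,                 # dual-channel limit as m grows
    }
```

For example, with `m = 1000`, `s = 64` and `p = 0.5` one gets `alpha = 1` and `u = 1128`, so the amortized rate is `1000 / 2128` and the dual-channel rate is `1000 / 1128`.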

We note yet another improvement, namely that the $1 - p$ term in Theorems 5.6 and 5.8 can be
replaced with $1 - 2p + p^{2}$ by using a version of Lemma 5.1 in which the guessing probability is replaced
by a collision probability. Our numbers do not reflect this.



5.8 Reduction to error-free channels 

Above we applied Theorem 5.3 in the case where the adversary channel is the binary symmetric one.
Here we look into applying it for other channels.

In general $\mathsf{En1}$ is chosen to correct errors over the receiver channel, and it may be hard to bound its
rs-r advantage. But in the case that the receiver channel is error-free, $\mathsf{En1}$ is just the identity function
$\mathsf{Id}_u$, and bounding rs-r advantage should be much easier, as we saw above. Here we aim to generalize
this approach. For a wide class of adversary channels and choices of $\mathsf{En1}$, we will show how to reduce
the problem of bounding the rs-r advantage of $\mathsf{En1}$ over the given channel to the problem of bounding
the rs-r advantage of $\mathsf{Id}_u$ over the same channel. This effectively makes security independent of the
details of the ECCs. More specifically, what we show is that for systematic codes (most codes are
systematic or can be made so), and for appropriately splittable adversary channels (all binary channels
fall in this category), the rs-r security of the code over the given channel depends only on the amount
of redundancy in the code and the rs-r security of the identity function over the same channel.

Let us now proceed to the details. An ECC $\mathsf{En1}: \{0,1\}^{u} \to \{0,1\}^{n_1}$ is systematic if there is a
redundancy function $\mathsf{Rd}: \{0,1\}^{u} \to \{0,1\}^{n_1 - u}$ such that $\mathsf{En1}(U) = U \| \mathsf{Rd}(U)$ for all $U \in \{0,1\}^{u}$. Then
we have the following:

Theorem 5.9 Let $\mathsf{En1}: \{0,1\}^{u} \to \{0,1\}^{u+r}$ be a systematic ECC with redundancy function $\mathsf{Rd}:
\{0,1\}^{u} \to \{0,1\}^{r}$. Let $\mathsf{Ch}_u: \{0,1\}^{u} \to \{0,1\}^{a}$ and $\mathsf{Ch}_r: \{0,1\}^{r} \to \{0,1\}^{b}$ be channels, and let $\mathsf{ChA1} =
\mathsf{Ch}_u \| \mathsf{Ch}_r: \{0,1\}^{u+r} \to \{0,1\}^{a+b}$. Then

$$ \mathbf{Adv}^{\mathrm{rs\text{-}r}}(\mathsf{En1}; \mathsf{ChA1}) \;\le\; 2^{r} \cdot \mathbf{Adv}^{\mathrm{rs\text{-}r}}(\mathsf{Id}_u; \mathsf{Ch}_u) . \qquad (15) $$

In the case that $\mathsf{ChA}$ is the binary symmetric channel, we could combine Proposition 5.5 and Theorem 5.9
to bound the rs-r advantage of a given $\mathsf{En1}$ assuming it was systematic. Specifically, if
$\mathsf{En1}: \{0,1\}^{u} \to \{0,1\}^{u+r}$ is a systematic ECC and $\mathsf{ChA} = \mathsf{BSC}_p^{u+r}$ then we would get $\mathbf{Adv}^{\mathrm{rs\text{-}r}}(\mathsf{En1}; \mathsf{ChA})
\le 2^{r} (1-p)^{u}$. This is not as good a bound as Proposition 5.7. In the case of BSCs, Theorem 5.9 is
thus not as effective as a direct analysis. But it is more general, and for channels other than BSCs it
may be hard to directly analyze rs-r security of $\mathsf{En1}$. In this case Theorem 5.9 will be helpful.
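
The comparison between the two bounds can be checked numerically on a tiny systematic code. The sketch below is only an illustration under stated assumptions: it uses a single-parity-bit code (`parity_code`, a hypothetical stand-in for $\mathsf{En1}$ with $r = 1$) over a BSC and computes the exact rs-r advantage by enumerating all channel outputs, which is feasible only for very small $u + r$.

```python
from itertools import product

def bsc_prob(x, z, p):
    """Probability that a BSC with crossover p maps bit-tuple x to bit-tuple z."""
    flips = sum(a != b for a, b in zip(x, z))
    return (p ** flips) * ((1 - p) ** (len(x) - flips))

def parity_code(x):
    """Toy systematic code (assumption): message bits followed by one parity bit."""
    return x + (sum(x) % 2,)

def rs_r_advantage(encode, u, p):
    """Exact GP(U | BSC_p(encode(U))) for uniform U over {0,1}^u."""
    messages = list(product((0, 1), repeat=u))
    codewords = [encode(x) for x in messages]
    n = len(codewords[0])
    total = 0.0
    for z in product((0, 1), repeat=n):
        # Best guess for output z: the message whose codeword most likely produced z.
        total += max(bsc_prob(c, z, p) for c in codewords) / len(messages)
    return total
```

With `u = 3` and `p = 0.4` the exact advantage is `0.216`, below the direct bound `2 * 0.6 ** 4 = 0.2592` of Proposition 5.7, which in turn is below the bound `2 * 0.6 ** 3 = 0.432` obtained via Theorem 5.9.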

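To prove Theorem 5.9 we first need a few definitions and lemmas. If $\mathsf{Ch}_1: D_1 \to R_1$ and $\mathsf{Ch}_2: D_2 \to
R_2$ are channels with $R_1 \subseteq D_2$ then their composition $\mathsf{Ch} = \mathsf{Ch}_2 \circ \mathsf{Ch}_1$ is the channel $\mathsf{Ch}: D_1 \to R_2$
defined by $\mathsf{Ch}(x) = \mathsf{Ch}_2(\mathsf{Ch}_1(x))$ for all $x \in D_1$. We say that channel $\mathsf{Ch}_3$ is a degradation of channel
$\mathsf{Ch}_1$ if there is a channel $\mathsf{Ch}_2$ such that $\mathsf{Ch}_3 = \mathsf{Ch}_2 \circ \mathsf{Ch}_1$.

Lemma 5.10 Let $\mathsf{En1}: \{0,1\}^{u} \to \{0,1\}^{n_1}$ be a function. Let $\mathsf{Ch}_1: \{0,1\}^{n_1} \to \{0,1\}^{c}$ and $\mathsf{Ch}_2:
\{0,1\}^{c} \to \{0,1\}^{d_1}$ be channels and let $\mathsf{Ch} = \mathsf{Ch}_2 \circ \mathsf{Ch}_1$. Then

$$ \mathbf{Adv}^{\mathrm{rs\text{-}r}}(\mathsf{En1}; \mathsf{Ch}) \;\le\; \mathbf{Adv}^{\mathrm{rs\text{-}r}}(\mathsf{En1}; \mathsf{Ch}_1) . \qquad (16) $$

Proof: Let random variable $U$ be uniformly distributed over $\{0,1\}^{u}$. It suffices to show that

$$ \max_{A} \Pr[A(\mathsf{Ch}(\mathsf{En1}(U))) = U] \;\le\; \max_{B} \Pr[B(\mathsf{Ch}_1(\mathsf{En1}(U))) = U] . \qquad (17) $$

To show this, let $A$ be any adversary. We define adversary $B_A$ via $B_A(Z) = A(\mathsf{Ch}_2(Z))$ for all $Z$.
Then for any $A$ we have

$$ \Pr[A(\mathsf{Ch}(\mathsf{En1}(U))) = U] = \Pr[B_A(\mathsf{Ch}_1(\mathsf{En1}(U))) = U] . \qquad (18) $$

Now take the max over all $A$ of both sides of (18) to get

$$ \max_{A} \Pr[A(\mathsf{Ch}(\mathsf{En1}(U))) = U] = \max_{A} \Pr[B_A(\mathsf{Ch}_1(\mathsf{En1}(U))) = U] $$
$$ \le \max_{B} \Pr[B(\mathsf{Ch}_1(\mathsf{En1}(U))) = U] , $$

the last inequality because the max over all $B$ includes those of the form $B_A$.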

The following lemma about guessing probabilities is a corollary of [15, Lemma 2.2] but for completeness 
we give a direct proof. 

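Lemma 5.11 Let $X, R, Z$ be random variables such that $R$ takes on at most $N \ge 1$ values. Then

$$ \mathbf{GP}(X \,|\, Z, R) \;\le\; N \cdot \mathbf{GP}(X \,|\, Z) . \qquad (19) $$

Proof: We have

$$ \mathbf{GP}(X \,|\, Z, R) = \sum_{r,z} \Pr[R = r, Z = z] \cdot f(r,z) \qquad (20) $$

where

$$ f(r,z) = \max_{x} \Pr[X = x \,|\, R = r, Z = z] $$
$$ = \max_{x} \frac{\Pr[X = x, R = r, Z = z]}{\Pr[R = r, Z = z]} $$
$$ = \max_{x} \frac{\Pr[X = x, R = r \,|\, Z = z] \cdot \Pr[Z = z]}{\Pr[R = r \,|\, Z = z] \cdot \Pr[Z = z]} $$
$$ = \max_{x} \frac{\Pr[X = x, R = r \,|\, Z = z]}{\Pr[R = r \,|\, Z = z]} $$
$$ \le \max_{x} \frac{\Pr[X = x \,|\, Z = z]}{\Pr[R = r \,|\, Z = z]} . \qquad (21) $$

From (20) and (21), and using $\Pr[R = r, Z = z] / \Pr[R = r \,|\, Z = z] = \Pr[Z = z]$, we have

$$ \mathbf{GP}(X \,|\, Z, R) \;\le\; \sum_{r,z} \Pr[Z = z] \cdot \max_{x} \Pr[X = x \,|\, Z = z] $$
$$ \le\; N \cdot \mathbf{GP}(X \,|\, Z) . $$

This concludes the proof.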
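
Lemma 5.11 is easy to test numerically on small joint distributions. The following sketch (illustrative only; `guessing_probability` and `random_joint` are hypothetical helper names) draws a random joint distribution of $(X, Z, R)$ and compares $\mathbf{GP}(X|Z,R)$ with $N \cdot \mathbf{GP}(X|Z)$, using the identity $\mathbf{GP}(X|\mathrm{view}) = \sum_{v} \max_x \Pr[X = x, \mathrm{view} = v]$.

```python
import random
from collections import defaultdict

def guessing_probability(joint, view):
    """GP(X | view(Z, R)) where joint maps (x, z, r) to its probability."""
    mass = defaultdict(lambda: defaultdict(float))
    for (x, z, r), pr in joint.items():
        mass[view(z, r)][x] += pr          # accumulates Pr[X = x, view = v]
    return sum(max(by_x.values()) for by_x in mass.values())

def random_joint(nx, nz, nr, seed=0):
    """A random joint distribution on {0..nx-1} x {0..nz-1} x {0..nr-1}."""
    rng = random.Random(seed)
    weights = {(x, z, r): rng.random()
               for x in range(nx) for z in range(nz) for r in range(nr)}
    total = sum(weights.values())
    return {key: w / total for key, w in weights.items()}

N = 4
joint = random_joint(nx=5, nz=3, nr=N)
gp_z = guessing_probability(joint, lambda z, r: z)          # GP(X | Z)
gp_zr = guessing_probability(joint, lambda z, r: (z, r))    # GP(X | Z, R)
```

On any such distribution one should observe `gp_z <= gp_zr <= N * gp_z`: the extra side information $R$ helps the guesser, but by at most a factor $N$.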
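Now we proceed to prove Theorem 5.9.

Proof of Theorem 5.9: $\mathsf{ChA1} = \mathsf{Ch}_u \| \mathsf{Ch}_r$ is a degradation of $\mathsf{Ch}_u \| \mathsf{Id}_r$, hence by Lemma 5.10 we
have

$$ \mathbf{Adv}^{\mathrm{rs\text{-}r}}(\mathsf{En1}; \mathsf{ChA1}) \;\le\; \mathbf{Adv}^{\mathrm{rs\text{-}r}}(\mathsf{En1}; \mathsf{Ch}_u \| \mathsf{Id}_r) . $$

We proceed to upper bound the latter and thereby establish (15) as follows:

$$ \mathbf{Adv}^{\mathrm{rs\text{-}r}}(\mathsf{En1}; \mathsf{Ch}_u \| \mathsf{Id}_r) = \mathbf{GP}(U \,|\, \mathsf{Ch}_u(U), \mathsf{Rd}(U)) \qquad (22) $$
$$ \le 2^{r} \cdot \mathbf{GP}(U \,|\, \mathsf{Ch}_u(U)) \qquad (23) $$
$$ = 2^{r} \cdot \mathbf{Adv}^{\mathrm{rs\text{-}r}}(\mathsf{Id}_u; \mathsf{Ch}_u) . $$

Since $\mathsf{En1}$ is systematic with redundancy function $\mathsf{Rd}$ we have $\mathsf{En1}(U) = U \| \mathsf{Rd}(U)$, which justifies (22).
Random variable $R = \mathsf{Rd}(U)$ takes on at most $2^{r}$ values. Let $X = U$ and $Z = \mathsf{Ch}_u(U)$ and apply
Lemma 5.11 to get (23).

A Related Work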



This section provides a comprehensive survey of the large body of work related to wiretap security, 
and more broadly, to information-theoretic secure communication in a noisy setup. 

The wiretap setting and the secrecy capacity. For the sake of clarity in comparing existing
results, it is convenient to consider two different variants of the notion of secrecy capacity addressed
by the wiretap literature. Given a pair of channels $(\mathsf{ChR}, \mathsf{ChA})$, their weak secrecy capacity $C_w$ is the
supremum of the rates achievable by pairs $(\mathcal{E}, \mathcal{D})$ consisting of an encryption scheme $\mathcal{E} = \{\mathcal{E}_k\}_{k \in \mathbb{N}}$
and a decryption scheme $\mathcal{D} = \{\mathcal{D}_k\}_{k \in \mathbb{N}}$ for $\mathcal{E}$ and $\mathsf{ChR}$ such that, first, the decoding requirement is
satisfied, i.e.,

$$ \lim_{k \to \infty} \mathbf{DE}(\mathcal{E}_k; \mathcal{D}_k; \mathsf{ChR}) = 0 , $$

and moreover,

$$ \lim_{k \to \infty} \frac{\mathbf{Adv}^{\mathrm{rcs}}(\mathcal{E}_k; \mathsf{ChA})}{k} = 0 , \qquad (24) $$

a property which we refer to as weak security in the following. Additionally, the strong secrecy capacity
$C_s \le C_w$ is obtained where we restrict the supremum over those schemes achieving RES security, i.e.,

$$ \lim_{k \to \infty} \mathbf{Adv}^{\mathrm{rcs}}(\mathcal{E}_k; \mathsf{ChA}) = 0 . \qquad (25) $$

It should be noted that both these quantities exhibit some drawbacks making them less appealing to
cryptographers: First, the supremum is taken over schemes that are not necessarily computationally
efficient. Moreover, both definitions only consider uniformly distributed messages. Finally, the privacy
requirement of $C_w$, i.e., weak security, permits complete leakage of $k^{1-\epsilon}$ plaintext bits, which is almost
always unacceptable. In fact, cryptographic applications call for an even stronger capacity notion
where $\mathbf{Adv}^{\mathrm{ds}}(\mathcal{E}_k; \mathsf{ChA})$ is a negligible function in $k$ (e.g., $2^{-\Theta(k)}$), and where this rate is achievable by an
efficiently implementable scheme. Our main technical result can be interpreted as showing that such
a stringent capacity notion is not smaller than $C_w$ for many settings of interest, and is attained by an
efficient and explicit scheme, which we provide.



Earlier work on the secrecy capacity. The wiretap scenario was first considered by Wyner
who provided a full characterization of $C_w$ in the special case where $\mathsf{ChA}$ is a degraded version of $\mathsf{ChR}$,
i.e., such that there exists a transform $T$ with $T \circ \mathsf{ChR} = \mathsf{ChA}$, where the composition operator $\circ$ is the
straightforward generalization of function composition to randomized transforms. Wyner's result was
later generalized by Csiszár and Körner [13], who showed$^2$ that for arbitrary channels $\mathsf{ChR}: \mathcal{X} \to \mathcal{Y}$
and $\mathsf{ChA}: \mathcal{X} \to \mathcal{Z}$,

$$ C_w = \max_{U, X} \big[ I(U; \mathsf{ChR}(X)) - I(U; \mathsf{ChA}(X)) \big] , $$

where the maximum is over all pairs of correlated random variables $U$ and $X$, with the latter variable
taking values from the channel domain. If both $\mathsf{ChR}$ and $\mathsf{ChA}$ are symmetric, the above simplifies to
(cf. e.g. [26] for a proof)

$$ C_w = H(U \,|\, \mathsf{ChA}(U)) - H(U \,|\, \mathsf{ChR}(U)) \qquad (26) $$

where $U$ is uniform on $\mathcal{X}$. Therefore, in the most common setting of $\mathsf{ChR} = \mathsf{BSC}_{p_R}$ and $\mathsf{ChA} = \mathsf{BSC}_{p_A}$
(for $p_R < p_A$), (26) yields $C_w = h_2(p_A) - h_2(p_R)$. Finally, we note that more recently, Bloch and
Laneman [6] have extended these results to the strong secrecy capacity $C_s$. Their work also considered
other capacity measures related to security metrics different than those considered in this paper;
however, all of these notions only cover random-input adversaries.
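
The BSC special case of (26) is a one-line computation. The sketch below evaluates it in Python; the function names are hypothetical, and it assumes crossover probabilities with $0 \le p_R \le p_A \le 1/2$, the regime stated above.

```python
import math

def h2(p):
    """Binary entropy function in bits, with h2(0) = h2(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def weak_secrecy_capacity_bsc(p_r, p_a):
    """C_w = h2(p_A) - h2(p_R) for ChR = BSC_{p_R}, ChA = BSC_{p_A}."""
    return h2(p_a) - h2(p_r)
```

For example, a noiseless receiver channel (`p_r = 0`) and a fully noisy adversary channel (`p_a = 0.5`) give capacity 1, while equal crossover probabilities give capacity 0.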

We stress that all aforementioned results are inherently non-explicit: That is, existence of
secrecy-capacity achieving schemes is proven via the probabilistic method, and the resulting scheme is neither
explicitly given, nor is it guaranteed to be efficient. In fact, to date, only a handful of efficient schemes
are known.

2 Csiszár and Körner actually considered a general setting where the outputs of ChR and ChA are correlated. However,
we note that as long as communication is one-way, such correlation is irrelevant and one treats both (marginal) channels
individually for correct decryption and for security, respectively.



Wiretap channel II. Ozarow and Wyner [37] also considered an alternative to the above wiretap
setting (called the wiretap channel II) where ChR is noiseless, but at the same time, Eve can learn a
fraction $\delta$ of the bits sent over ChR, and does not learn anything about the remaining $(1 - \delta)$ fraction.
Solutions were presented relying on error-correcting codes [37], [44]. Also, the notable work of [10]
noted that such protocols with good parameters can be built from primitives such as deterministic
randomness extractors for symbol-fixing sources with efficient inversion [23], as well as from $k$-wise
independent functions [25] and related tools from exposure-resilient cryptography, such as all-or-nothing
transforms [H [T7] .



Information-theoretic key agreement. A technically and conceptually related line of work
considers the setting of information-theoretically secure key agreement from noisy primitives where
Alice and Bob can additionally interact via a noiseless public channel. The typical question is to study
the secret-key rate, i.e., the maximal ratio, achievable by a secret-key agreement scheme, between the
number of key bits generated by the protocol and the number of uses of the noisy primitive. (In general,
there is no bound on the amount of communication over the public channel.) Note that key agreement
and encryption are equivalent in this setting.



Maurer [30] introduced the problem and first showed that in Wyner's wiretap setting with $\mathsf{ChR} =
\mathsf{BSC}_{p_R}$ and $\mathsf{ChA} = \mathsf{BSC}_{p_A}$, a higher secret-key rate $h_2(p_R + p_A - 2 p_R p_A) - h_2(p_R)$ is achievable than in
the pure wiretap setting. He later generalized [31] the question to a setting where Alice, Bob, and Eve
are each given independent samples of random variables $X$, $Y$, and $Z$, respectively. This seminal work
was followed by several results in different directions, such as improving techniques used in these
protocols [U \7\ SO] , understanding the achievable secret-key rates in different settings [Ml EHJ [21], as well
as extending to a setting where the channel between Alice and Bob is not authentic [33, 35, 21, 18, 19].
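
To see how much the public channel helps, the secret-key rate quoted above can be compared numerically with the pure wiretap capacity $h_2(p_A) - h_2(p_R)$. This is a small illustrative sketch with hypothetical function names, assuming crossover probabilities in $[0, 1/2]$.

```python
import math

def h2(p):
    """Binary entropy function in bits, with h2(0) = h2(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def secret_key_rate_bsc(p_r, p_a):
    """The rate h2(p_R + p_A - 2 p_R p_A) - h2(p_R) quoted above for BSCs."""
    return h2(p_r + p_a - 2 * p_r * p_a) - h2(p_r)

def wiretap_capacity_bsc(p_r, p_a):
    """Pure wiretap (no public channel) value h2(p_A) - h2(p_R)."""
    return h2(p_a) - h2(p_r)
```

With `p_r = 0.1` and `p_a = 0.2` the key rate exceeds the wiretap value, and with `p_r = p_a = 0.1` the wiretap value is zero while the key rate remains positive.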



We also note that many of the tools developed in this area found interesting applications in the
context of cryptography based on biometrics (starting from [15]).



Other wiretap models. We note that there is an active area of research aimed at understanding
the secrecy and the secret-key capacities in even more settings, such as bi-directional wiretap settings
(cf. e.g. pQ), multi-party settings, and channels with continuous noise. We refer the reader e.g. to [5]
for a survey of these results.
B Proofs for Section 4
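B.1 Proof of Lemma 4.2
Proof: Writing $I_{M,C}(m,c) = P_M(m) \cdot P_C(c)$ for the product of the marginals, we have

$$ I(M; C) = -\sum_{m} P_M(m) \lg P_M(m) + \sum_{c} P_C(c) \sum_{m} P_{M|C=c}(m) \lg P_{M|C=c}(m) $$
$$ = -\sum_{m,c} J_{M,C}(m,c) \lg P_M(m) + \sum_{c} \sum_{m} J_{M,C}(m,c) \lg P_{M|C=c}(m) $$
$$ = -\sum_{m,c} J_{M,C}(m,c) \cdot \big( \lg P_M(m) - \lg P_{M|C=c}(m) \big) $$
$$ = -\sum_{m,c} J_{M,C}(m,c) \cdot \lg \frac{P_M(m)}{P_{M|C=c}(m)} $$
$$ = \sum_{m,c} J_{M,C}(m,c) \cdot \lg \frac{P_{M|C=c}(m)}{P_M(m)} $$
$$ = \sum_{m,c} J_{M,C}(m,c) \cdot \lg \frac{J_{M,C}(m,c)}{I_{M,C}(m,c)} $$
$$ = \mathbf{D}(J_{M,C}; I_{M,C}) $$

as desired.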
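
The identity between mutual information and the divergence of the joint distribution from the product of its marginals can be checked numerically. The sketch below is illustrative; the function names are hypothetical and the joint distribution is represented as a dictionary mapping pairs `(m, c)` to probabilities.

```python
import math
from collections import defaultdict

def marginals(joint):
    """Marginal distributions of M and C from a joint {(m, c): probability}."""
    pm, pc = defaultdict(float), defaultdict(float)
    for (m, c), pr in joint.items():
        pm[m] += pr
        pc[c] += pr
    return pm, pc

def mutual_information(joint):
    """I(M;C) = H(M) - H(M|C), in bits."""
    pm, pc = marginals(joint)
    h_m = -sum(p * math.log2(p) for p in pm.values() if p > 0)
    h_m_given_c = -sum(pr * math.log2(pr / pc[c])
                       for (m, c), pr in joint.items() if pr > 0)
    return h_m - h_m_given_c

def divergence_from_product(joint):
    """KL divergence, in bits, of the joint from the product of its marginals."""
    pm, pc = marginals(joint)
    return sum(pr * math.log2(pr / (pm[m] * pc[c]))
               for (m, c), pr in joint.items() if pr > 0)
```

For a correlated pair such as `{(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}` the two quantities coincide up to floating-point error and are strictly positive.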
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